Wednesday, April 29, 2009

template auto run script

<?php

$startRow = 0;
$rowPerPage = 10;
$totalRow = 111;

run_yahoo($startRow, $rowPerPage, $totalRow);

function run_yahoo($startRow, $rowPerPage, $totalRow) {
    // How many batches are needed to cover all rows.
    $runCycle = (int)($totalRow / $rowPerPage);

    if (($totalRow % $rowPerPage) != 0) {
        $runCycle++;
    }

    $nextStartRow = NULL;

    for ($i = 0; $i < $runCycle; $i++) {
        $nextStartRow = $nextStartRow ? $nextStartRow : $startRow;
        $nextStartRow = _get_yahoo_data($nextStartRow, $rowPerPage, $totalRow);
    }
}

function _get_yahoo_data($startRow, $rowPerPage, $totalRow) {
    $nextStartRow = $startRow + $rowPerPage;

    // MySQL LIMIT takes (offset, row count), so each batch is LIMIT offset, $rowPerPage.
    echo 'LIMIT ' . $startRow . ', ' . $rowPerPage . "\n";

    return $nextStartRow;
}

?>

Tuesday, April 28, 2009

change the default layout output of a node

Node theming by field: deconstructing $node->content in Drupal 5
Submitted by jody on February 11, 2008 - 11:53pm

* drupal
* theming

The CCK ‘display fields’ settings are very useful for theming nodes by content type, but I often find myself having to get further into customizing node output field by field.

Once I decide there’s no way around taking finer control of my node’s fields, I create a node-{type}.tpl.php by copying my existing node.tpl.php and open up Devel’s ‘Node Render’ tab. Then I delete the $content variable from the template and start adding things like
<?php
print $node->content['field_image']['#value'];
?>

and
<?php
print $node->content['body']['#value'];
?>

The $content variable is nice for amateurs, but we need the unrendered $node->content to get good control of a node.

While this gives me complete control over the output of the node’s fields, the drawback is that now if I add a new field to the node, or if I add a module that adds something to $node->content, I have to go back to my template and add in this new element. Because I often do my theming and development in parallel, this can be rather annoying, and there is also a danger that I could overlook doing it.

Therefore I think it may be more practical to use code like this in the node template:

<?php
foreach ($node->content as $key => $field) {
  print $field['#value'];
}
?>


Instead of printing each field that is wanted in the template, I can instead unset the fields that are not wanted and then print all the fields. This is faster to write and it lets any new fields show up. After unsetting unwanted fields it looks like this:
<?php
unset($node->content['field_dumb_field']);
unset($node->content['field_annoying']);
unset($node->content['#children']);

foreach ($node->content as $key => $field) {
  print $field['#value'];
}
?>


(I found that I need to unset $node->content[‘#children’] to get this method to work correctly.)

Now if you want to control the output of specific fields individually you could add a switch:
<?php
unset($node->content['field_dumb_field']);
unset($node->content['field_annoying']);
unset($node->content['#children']);

foreach ($node->content as $key => $field) {
  switch ($key) {
    case 'body':
      print '<div class="hot-body">'. $field['#value'] .'</div>';
      break;
    case 'field_awesome':
      print '<div class="special-markup">'. $field['#value'] .'</div>';
      break;
    default:
      print $field['#value'];
      break;
  }
}
?>


Now I know that these template files are not supposed to get all mucked up with php, so this method is really just a hack and won’t be appreciated if you are working with a designer who dislikes php. But if you, like me, were finding yourself dissecting $node->content into crazy pieces with tons of php in your node templates already, perhaps this method could be a slight improvement.
Comments
sirkitree (not verified) on February 12, 2008 - 3:40am

Yes it is better to get this done in template.php and then have your .tpl file clean for a designer, but this is a key step that should be learned on the way there. After that, if you find your template.php file getting overcrowded, you can split out your $vars into node-[content-type].vars.php.

* reply

jody on February 12, 2008 - 10:09am

I wonder if we could combine these two methods. In _phptemplate_variables maybe we could assign all the fields we want as $vars and also assign a variable to represent all the fields we neither assigned nor unset, like a catch-all.
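
(A rough sketch of that idea, assuming Drupal 5's _phptemplate_variables() hook in template.php; the variable names $body_output and $everything_else are made up for illustration. The node template would then just print $body_output and $everything_else.)

<?php
// A sketch only, assuming Drupal 5 / PHPTemplate; something like this would go in template.php.
function _phptemplate_variables($hook, $vars = array()) {
  if ($hook == 'node') {
    $node = $vars['node'];

    // Fields we want to place individually in the template.
    $vars['body_output'] = $node->content['body']['#value'];
    unset($node->content['body']);

    // Fields we never want rendered.
    unset($node->content['field_dumb_field']);
    unset($node->content['#children']);

    // Catch-all for everything we neither assigned nor unset.
    $vars['everything_else'] = '';
    foreach ($node->content as $key => $field) {
      $vars['everything_else'] .= $field['#value'];
    }
  }
  return $vars;
}
?>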

* reply

adrinux (not verified) on February 12, 2008 - 6:25am

Fantastic write up! I think I’ll be using this.

* reply

moshe weitzman (not verified) on February 12, 2008 - 8:14am

very nice writeup. we want to improve this experience for themers in D7. they will have to use a few simple php functions but the temptation to override $content will be much smaller. we will introduce render() and hide() so you can put pieces of $content where you want without wrecking future enhancements. your writeup is current state of the art though - thanks for sharing.

* reply

jody on February 12, 2008 - 10:00am

Eaton’s goals in that thread seem to be exactly right - I’m really glad to see people are working hard on this issue.

* reply

Alan Burke (not verified) on February 12, 2008 - 8:45am

There’s a better way
See the theme folder of the CCK module
There’s a readme txt file which explains how to have individual template files for each CCK field.

Regards
Alan

I Thought I had added it to Drupal.org…but I can’t find it now

* reply

jody on February 12, 2008 - 9:36am

Hi Alan,

I checked out that CCK theme README and yes it does explain how to make template files for individual fields. I think that most of the time that won’t be helpful to me. A good example of what I’m most often trying to do with my fields is to just stick a few of them in the same div in the node template. I think a better example of what I’m doing may be:

<?php
unset($node->content['field_dumb_field']);
unset($node->content['field_annoying']);
unset($node->content['#children']);
?>

<div class="details">
<?php
  print $node->content['field_a']['#value'];
  unset($node->content['field_a']);
  print $node->content['field_b']['#value'];
  unset($node->content['field_b']);
?>
</div>

<div class="everything-else">
<?php
  foreach ($node->content as $key => $field) {
    print $field['#value'];
  }
?>
</div>

* reply

Jacine (not verified) on February 12, 2008 - 8:33pm

Hi Jodi!

This is great. Thank you for posting.

I have a quick question. How would you do this for CCK fields that are inside a fieldgroup?

Thanks again :)

* reply

jody on February 13, 2008 - 8:44pm

That’s a good question- using fieldgroups makes this even more complex. Here’s an idea showing some fields from a fieldgroup being explicitly displayed, and then anything else in the fieldgroup gets printed last.

<?php
unset($node->content['#children']);
?>

<div class="details">
<?php
  // A regular field
  print $node->content['field_a']['#value'];
  unset($node->content['field_a']);
?>
</div>

<div class="group-special">
<?php
  // All fields within a certain fieldgroup
  foreach ($node->content['group_special'] as $key => $groupfield) {
    if ($groupfield['#value']) {
      print $groupfield['#value'];
    }
  }
  unset($node->content['group_special']);
?>
</div>

<div class="everything-else">
<?php
  // All other fields, including those in other fieldgroups
  foreach ($node->content as $key => $field) {
    if ($field['#value']) {
      print $field['#value'];
    }
    else {
      foreach ($field as $groupfield) {
        if ($groupfield['#value']) {
          print $groupfield['#value'];
        }
      }
    }
  }
?>
</div>

* reply

Jacine (not verified) on February 15, 2008 - 8:56pm

Thank you Jody! I was struggling with the fieldsets for quite a while. This is perfect :)

* reply

John (not verified) on March 3, 2008 - 12:45am

Thank you for this info. I have used it in a site and it has been helpful. I ran into an issue that unfortunately limits its usefulness however, in that the node content bit that I have pulled into a block is not rendering PHP. I have PHP enabled as the input format on both the node and the block. The PHP [a call to base_path()] shows in the source but is not processed as PHP. Has anyone experienced this and found a solution?

* reply

John (not verified) on March 12, 2008 - 1:38pm

I think I was trying to do too much with this. While I never did find an answer to my issue, I did resolve the situation by using "CCK Blocks" to get my additional fields into blocks that would evaluate PHP.

So your technique was helpful in that I could add fields to a node but keep some of them from displaying, for example in a typical content-in-center-column layout. Then through CCK Blocks I could use those fields elsewhere on the page (the left column in my case).

Thanks again for taking the time to post this helpful technique.

* reply

jody on March 12, 2008 - 1:46pm

I would be concerned if PHP was not evaluating in my blocks. I hope that you did the obvious and enclosed it in php tags. I have never heard of this issue.

As regards simply not showing certain of your cck fields, you can take care of this by visiting the ‘display settings’ tab of your content type’s administration. The technique posted here is intended for more advanced theming needs.

* reply

Martijn (not verified) on March 19, 2008 - 8:48am

Hi Jody,

Excellent writeup. This helps a lot with Drupal theming!
You yourself have a great theme also!

greetings,
Martijn

* reply

Chacha (not verified) on March 25, 2008 - 4:39pm

Hey!! So good to find your website. This write up is great. I am now going to try unset on my dumb annoying fields right now!

* reply

Marcus (not verified) on May 30, 2008 - 6:57pm

Jody,

Thanks for posting these node theming tips. The example code you set forth has been immensely helpful.

Much appreciated,
Marcus

* reply

Marcus (not verified) on May 30, 2008 - 6:58pm

Hi, Jody,

Thanks for posting these excellent tips on how to theme nodes.

Much appreciated,
Marcus

* reply

Alter table - 1205 lock wait timeout exceeded


How can I alter my timeout setting?
Hey there,
I have a rather long script that makes several connections to a MySQL database. Sometimes, if there is a problem with the script, or a bug in my code, or something else unexpected, the connections don't get closed. Is there a way that I can set MySQL to notice when a thread has been idle for, say, 5 minutes and kill it for me? Like a timeout function.
I have googled and searched around here, but I am having trouble finding out just where to do this.
I have MySQL Administrator and phpMyAdmin that I can use if it's any easier.
thanks




It does. The wait_timeout setting is set to 28,800 seconds and MySQL will kill inactive threads after this. You can set it while MySQL is running with "SET wait_timeout = $value;", or set it directly in your my.ini file.

--Simon

SET wait_timeout = 900;
ALTER TABLE `stock_db`.`stock_coredata_usmqtr` ADD INDEX `MGStockNumberIndustryID_InterimEndingDate` USING BTREE(`MGStockNumberIndustryID`, `InterimEndingDate`);

http://forums.devshed.com/mysql-help-4/how-can-i-alter-my-timeout-setting-313101.html

=====

Try bulk loading instead; it is better optimized for cluster:
LOAD DATA INFILE;

BR
-- Martin

======

Got to the problem...

The problem was that the my.ini file (on Windows) was the one to edit, not config.ini. In this file we have to set the timeout by setting innodb_lock_wait_timeout.

======
The InnoDB error codes page lists the following information:

1205 (ER_LOCK_WAIT_TIMEOUT)

Lock wait timeout expired. Transaction was rolled back.



So how can we prevent that from happening?

This happens because the value of innodb_lock_wait_timeout was not enough for InnoDB to acquire the lock. On my servers it was set to default: 50.

The manual has following information:

The timeout in seconds an InnoDB transaction may wait for a lock before being rolled back. InnoDB automatically detects transaction deadlocks in its own lock table and rolls back the transaction. Beginning with MySQL 4.0.20 and 4.1.2, InnoDB notices locks set using the LOCK TABLES statement. Before that, if you use the LOCK TABLES statement, or other transaction-safe storage engines than InnoDB in the same transaction, a deadlock may arise that InnoDB cannot notice. In cases like this, the timeout is useful to resolve the situation. The default is 50 seconds.
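
In practice, that means raising the value in the [mysqld] section of my.cnf/my.ini and restarting mysqld (the 120 below is only an example figure, not a recommendation); the running value can be checked with SHOW VARIABLES LIKE 'innodb_lock_wait_timeout';

[mysqld]
innodb_lock_wait_timeout = 120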

Monday, April 27, 2009

Get filename in php command line

<?php
### Get filename in php command line (CLI) mode
echo __FILE__ . "\n";
echo $_SERVER['PHP_SELF'] . "\n"; // the old $PHP_SELF global only exists with register_globals
?>

Quick and Dirty MySQL Backup

Until recently, the MySQL databases I work with contained only data that could be retrieved from other sources. Most of the data is either batched in from flat files or another database. It would be inconvenient to reload a couple of months' worth of data, but since these databases are not mission critical, the business could operate without them for a couple of days. Lately, we've been implementing some semi-critical systems that rely on a somewhat expedient recovery.

The requirements for the project were that the database must remain up during the backup and losing a day's worth of data was acceptable. All of my regular Oracle readers are cringing at the moment, but hey, those were the rules I was working with.

My first thought was to use mysqlhotcopy because it backed up the actual physical files. However, mysqlhotcopy only allows you to backup MyISAM tables and we extensively use InnoDB.

My next choice was mysqldump. mysqldump basically takes the entire database and dumps a text file containing DDL and DML that will re-create your database. Coming from an Oracle background, I knew there were shortcomings to dumping the entire database, but hopefully I could mitigate them.

The first hurdle was security. I specifically turn off unauthenticated root access on my databases, but I needed to be able to read all the tables to do a backup. I don't want to hard-code my root password or any password in a script as I don't have suicidal tendencies (diagnosed, anyway). So I created a user called backup that could only login from the server machine, but could login unauthenticated.
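
(The post never shows the actual statement, and commenters below ask for it; a minimal sketch of such a backup user, assuming localhost-only access and just the privileges mysqldump needs, might look like the following.)

-- Hypothetical example only, not the author's original SQL:
GRANT SELECT, LOCK TABLES ON *.* TO 'backup'@'localhost';
FLUSH PRIVILEGES;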

The next thing I had to figure out was how to get a consistent view of the data. I knew that my developers preferred InnoDB for its referential integrity features, and getting inconsistent data would be disastrous. Fortunately, one of the flags to mysqldump is --single-transaction, which essentially takes a snapshot in time.

So I wrote a script around mysqldump and --single-transaction and dumped my entire database to disk. Every now and again, I encountered an "Error 2013: Lost connection to MySQL server during query when dumping table `XYZ` at row: 12345". The row number changed each time, so I figured it had something to do with either activity in the database or memory. I could rerun the command and it usually finished the second or third time.

After the third day straight of my backup failing, I decided to research it a little more. mysqldump has a flag called --quick which bypasses the cache and writes directly to disk. I put this flag in my backup script and the script started finishing more consistently.

The last hurdle was having enough space on disk to store my backups. Since the backup file is really a text file, I decided to pipe the output through gzip to reduce its size.

Currently, my quick and dirty backup script is a wrapper around the following command:


mysqldump --all-databases --quick --single-transaction -u backup | gzip > mybackup.sql.gz


We're adopting MySQL at a blistering pace, so I'm sure I'll need to make changes in the future. For right now, though, it gets the job done.
Posted by Jeff Hunter at 8:43 PM
10 comments:

Anonymous said...

Hi,

Can you post SQL string you used to create backup user please.

Roland
11/07/2006 06:56:00 AM
gandolf989 said...

Under what situations is mySQL better than Oracle?
11/07/2006 10:10:00 AM
Anonymous said...

While your Database is still relatively small, you might want to look into replication, and backing up the replicated machine.

The Replicated machine will contain an exact snapshot of the live machine in case it goes down, and the backups of the replicated machine are there in case someone accidently deletes or over writes data.

/bryan
11/07/2006 10:43:00 AM
Anonymous said...

Not great I'm afraid. This will cause MyISAM tables to lock for the duration of the dump and will take ages to do innodb tables. You sohuld really consider moving everything to InnoDB and buying InnoDB hot Backup.
11/07/2006 12:20:00 PM
Jeff Hunter said...

This will cause MyISAM tables to lock for the duration of the dump and will take ages to do innodb tables. You sohuld really consider moving everything to InnoDB and buying InnoDB hot Backup.
We don't use MyISAM and I don't think 15 minutes is ages. I've investigated InnoDB backup and when the time is right, we'll probably get it. But right now, the quick and dirty method works well with my business rules.
11/07/2006 06:08:00 PM
Anonymous said...

--opt helps :)

why don't you buy innodb hot backup?
11/08/2006 02:11:00 PM
Jeff Hunter said...

why don't you buy innodb hot backup?
When the data becomes more critical and the MTTR becomes shorter, I certainly will.
11/09/2006 10:39:00 AM
Sheeri said...

firstly, --opt is the default for mysqldump. It's always used unless you specifically use --skip-opt.

Secondly, why not use Zmanda's hot backup for MySQL? It's free and open source, and quick to set up. And best of all, it doesn't use a proprietary format.

http://www.zmanda.com/backup-mysql.html
http://www.zmanda.com/zrm-mysql-community.html
11/11/2006 08:57:00 AM
Anonymous said...

As Roland said above, what was the command to create the backup user?
12/25/2006 02:44:00 AM
Anonymous said...

What if you want to restore only one database? That would be a pain using --all-databases and gzipping everything.

I wrote a small PHP script to handle all my backup needs, I also feel that it's enough with a nightly backup and it's ok to lose one day of databases if we have a crash (at least so far), so I made this script: http://iblins01.quidnovo.se/~sebjma/backup.phps

The script is called via crontab during low traffic times (usually at night) and will produce gzipped files for each database on the system. Backups older than a predefined timelimit is deleted from the filesystem.

I hope that someone else might find this handy!
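
(The linked script is not reproduced here; the following is only a rough sketch of the same idea in PHP: one gzipped mysqldump per database plus cleanup of old files. The paths, credentials, database list, and retention period are all placeholders.)

<?php
// Rough sketch only -- not the script linked above. All values below are placeholders.
$user     = 'backup';
$pass     = 'secret';
$destDir  = '/var/backups/mysql';
$keepDays = 7;

// One gzipped dump file per database.
$databases = array('db_one', 'db_two');
foreach ($databases as $db) {
    $file = $destDir . '/' . $db . '-' . date('Ymd') . '.sql.gz';
    $cmd  = 'mysqldump --quick --single-transaction'
          . ' -u ' . escapeshellarg($user) . ' -p' . escapeshellarg($pass)
          . ' ' . escapeshellarg($db)
          . ' | gzip > ' . escapeshellarg($file);
    system($cmd);
}

// Delete backups older than the retention period.
foreach (glob($destDir . '/*.sql.gz') as $old) {
    if (filemtime($old) < time() - $keepDays * 86400) {
        unlink($old);
    }
}
?>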

php code template

<?php
// ### Add by Danny Hsieh. 2009-03-07 15:06:27.
error_reporting(E_ALL);
ini_set('display_errors', TRUE);
ini_set('display_startup_errors', TRUE);

ini_set('max_execution_time', 0);

include('./../common.php');

echo '[START] ' . date('Y-m-d h:i:s') . "\n";

$conn = new mysqli($db_server, $db_user, $db_pass);

/* check connection */
if ($conn->connect_errno) {
    printf("Connect failed: %s\n", $conn->connect_error);
    exit();
}

/* change db to your db */
$conn->select_db($db_name['danny_testing']);

mb_internal_encoding("UTF-8");
$conn->query("SET NAMES 'utf8'");
$conn->query("SET CHARACTER SET 'utf8'");

### SET AUTOCOMMIT to 0 (for speeding up the process)
$conn->query("SET autocommit=0;");

$sql = "SELECT ";
$sql .= " ";
$sql .= " ";
$rs = $conn->query($sql);
if ($conn->error) { die('Invalid query: ' . $conn->error); }

if ($rs) {

    $rowCount = 0;
    while ($row = $rs->fetch_array()) {

        ### COMMIT (save) the data every 5000 rows
        if ($rowCount % 5000 == 0) {
            $conn->query('COMMIT;');
        }

        $rowCount++;
    }
}

### COMMIT (save) the rest of the rows.
$conn->query('COMMIT;');

$rs->close(); // Alternatively: $rs->free_result()

$rowData = array(
    'field_name' => $row['field_name'],
);

// $conn->query(sql_rewrite_insert('the_table_name', $rowData));
// echo sql_rewrite_insert('the_table_name', $rowData) . "\n";
// if ($conn->error) { die('Invalid query: ' . $conn->error); }
// if ($conn->error) { printf("Invalid query: %s\n", $conn->error); continue; }
//
// $whereFieldArr = array(
//     'id' => $id_list[$key],
// );
// $conn->query(sql_rewrite_update('the_table_name', $rowData, $whereFieldArr));
// echo sql_rewrite_update('the_table_name', $rowData, $whereFieldArr) . "\n";
// if ($conn->error) { die('Invalid query: ' . $conn->error); }
// if ($conn->error) { printf("Invalid query: %s\n", $conn->error); continue; }

$conn->close();



echo '[END] ' . date('Y-m-d h:i:s') . "\n";

echo 'Done!';
?>

Saturday, April 25, 2009

Using Per-Table Tablespaces


You can store each InnoDB table and its indexes in its own file. This feature is called “multiple tablespaces” because in effect each table has its own tablespace.

Using multiple tablespaces can be beneficial to users who want to move specific tables to separate physical disks or who wish to restore backups of single tables quickly without interrupting the use of other InnoDB tables.

To enable multiple tablespaces, start the server with the --innodb_file_per_table option. For example, add a line to the [mysqld] section of my.cnf:

[mysqld]
innodb_file_per_table

With multiple tablespaces enabled, InnoDB stores each newly created table into its own tbl_name.ibd file in the database directory where the table belongs. This is similar to what the MyISAM storage engine does, but MyISAM divides the table into a tbl_name.MYD data file and a tbl_name.MYI index file. For InnoDB, the data and the indexes are stored together in the .ibd file. The tbl_name.frm file is still created as usual.

You cannot freely move .ibd files between database directories as you can with MyISAM table files. This is because the table definition that is stored in the InnoDB shared tablespace includes the database name, and because InnoDB must preserve the consistency of transaction IDs and log sequence numbers.

If you remove the innodb_file_per_table line from my.cnf and restart the server, InnoDB creates tables inside the shared tablespace files again.

The --innodb_file_per_table option affects only table creation, not access to existing tables. If you start the server with this option, new tables are created using .ibd files, but you can still access tables that exist in the shared tablespace. If you start the server without this option, new tables are created in the shared tablespace, but you can still access any tables that were created using multiple tablespaces.
Note

InnoDB always needs the shared tablespace because it puts its internal data dictionary and undo logs there. The .ibd files are not sufficient for InnoDB to operate.

To move an .ibd file and the associated table from one database to another, use a RENAME TABLE statement:

RENAME TABLE db1.tbl_name TO db2.tbl_name;

If you have a “clean” backup of an .ibd file, you can restore it to the MySQL installation from which it originated as follows:

1. Issue this ALTER TABLE statement to delete the current .ibd file:

   ALTER TABLE tbl_name DISCARD TABLESPACE;

2. Copy the backup .ibd file to the proper database directory.

3. Issue this ALTER TABLE statement to tell InnoDB to use the new .ibd file for the table:

   ALTER TABLE tbl_name IMPORT TABLESPACE;

In this context, a “clean” .ibd file backup is one for which the following requirements are satisfied:

* There are no uncommitted modifications by transactions in the .ibd file.
* There are no unmerged insert buffer entries in the .ibd file.
* Purge has removed all delete-marked index records from the .ibd file.
* mysqld has flushed all modified pages of the .ibd file from the buffer pool to the file.

You can make a clean backup .ibd file using the following method:

1. Stop all activity from the mysqld server and commit all transactions.

2. Wait until SHOW ENGINE INNODB STATUS shows that there are no active transactions in the database, and the main thread status of InnoDB is Waiting for server activity. Then you can make a copy of the .ibd file.

Another method for making a clean copy of an .ibd file is to use the commercial InnoDB Hot Backup tool:

1. Use InnoDB Hot Backup to back up the InnoDB installation.

2. Start a second mysqld server on the backup and let it clean up the .ibd files in the backup.


User Comments
Posted by Ryan Gerry on October 4 2005 7:54pm

Note that the advice above regarding the DISCARD TABLESPACE and IMPORT TABLESPACE only works when you restore the .ibd file to the "installation from which it originated." If you try to copy ibd files into a new database instance you will get an error like this:

InnoDB: Error: tablespace id in file './test/<ibd_file_name.ibd>' is 38, but in the InnoDB data dictionary it is 401.

When you run:

ALTER TABLE <table_name> IMPORT TABLESPACE

It is possible to copy InnoDB tables between two databases using mysqldump and mysqlimport.
Posted by Ricardo Marques on May 24 2006 5:48pm

Note that .ibd files are always single auto-extending files.

So it's NEITHER possible to set an initial size for .ibd files NOR to set a maximum file size for them.

This information was posted by Heikki Tuuri (creator of the InnoDB transactional storage engine) in the InnoDB forum of the MySQL web site, in January 2006, at:

http://forums.mysql.com/read.php?22,26788,63040#msg-63040
Posted by Sunny Walia on April 4 2007 1:46am

I am actually using this feature to help me manage disk usage. Here are details on what I am trying to achieve; hopefully it will help somebody. http://crazytoon.com/2007/04/03/mysql-ibdata-files-do-not-shrink-on-database-deletion-innodb/
Posted by Dario Souza on September 19 2007 4:00am

If you need to move some or all of your tables to separate .ibd files, you may try the following:

1) add "innodb_file_per_table" option to my.cnf
2) restart mysqld
3) do an ALTER TABLE <table> ENGINE=InnoDB; on each table you want to move apart. This will recreate the table in its separate .ibd file.

The reverse also applies. Coalescing and/or moving apart tables only requires you to restart mysqld with or without the innodb_file_per_table option flag.
Posted by murtuja khokhar on March 28 2008 8:52am

Thanks Dario Souza,

This query will generate ALTER TABLE ENGINE=InnoDB; type statements for all tables.

select concat(concat('alter table ',table_name),' ENGINE=InnoDB;') FROM INFORMATION_SCHEMA.tables
where table_type='BASE TABLE'
Posted by Hilay Selivansky on June 7 2008 6:14am

I think this SQL would do a better job, following murtuja khokhar's advice:

select concat('alter table ',TABLE_SCHEMA ,'.',table_name,' ENGINE=InnoDB;') FROM INFORMATION_SCHEMA.tables
where table_type='BASE TABLE' and engine = 'InnoDB'

Since not all tables are in the same schema, and not all of them are InnoDB-based.

Hilay Selivansky
הילי זליבנסקי

Posted by Brian Demant on February 5 2009 8:36pm

the following will also do the trick

mysqlcheck --optimize --all-databases
Posted by Rishi Pande on April 14 2009 8:31am

Just in case, you can also use:

innodb_file_per_table = 1

..in your my.cnf file under [mysqld] section. It has the same effect as just "innodb_file_per_table".

Differences Between innodb_data_file_path and innodb_file_per_table

By: Keith Murphy

Recently, a customer wondered if they should start using the innodb_file_per_table option, or if they should continue to use the large InnoDB tablespace files created by the innodb_data_file_path option in the my.cnf option file.

Many people still use the older innodb_data_file_path option because it is the default for MySQL server. So, what are the benefits of using innodb_file_per_table instead?

The innodb_file_per_table option makes for easier-to-manage files. With this option each InnoDB table has its own data and index file under the database directory. As an example, if you had table foo located in database xyz, the InnoDB data file for table foo would be /var/lib/mysql/data/xyz/foo.ibd. Each table would have its own .ibd file in the appropriate database directory. This is in contrast to using the innodb_data_file_path option with (typically) one large file in the root of your data directory. For example, it might be /var/lib/mysql/data/ibdata1. All table data and indexes would be stored in this one file, and it can be very large and unwieldy. I don’t recall the largest ibdata file I have seen, but what do you do if you have a 100 gig InnoDB tablespace file? It can, and does, happen. The file contains what amounts to all the data of all your databases on the server.

Also, with the innodb_file_per_table option, you know what table is in each file. You know that table foo is in the foo.ibd file. No other table’s data or index information is going to be in there. There is still a shared tablespace but it is going to be much smaller and more manageable. It contains the shared meta-data of all the databases, not individual table data. With the shared tablespace option, all data is typically going to be in one file, and you can’t differentiate between databases or tables.

Are there any advantages to using innodb_data_file_path? Glad you asked! There is one. You can specify multiple data files with the innodb_data_file_path option in my.cnf, and “spread the load” over multiple partitions or hard drives. That is the theory. In practice, it is very rarely worth it.

Another question that often comes up is, if we are already using the single tablespace format can we convert to the innodb_file_per_table option? You can convert from one to the other but it will require that you stop writes to the server, perform a complete logical backup (i.e. mysqldump) of the server, shut down the server, change the my.cnf configuration to file per table, and then restart the server. You will then need to drop all the current databases and then reimport all your data. Fun!

One last benefit of using innodb_file_per_table is that defragmenting the table (using OPTIMIZE TABLE) will actually free up space in the data file, whereas right now, if you defragment an InnoDB table, the ibdata file will not shrink.

Using Per-Table Tablespaces


Jun 20, 2008
Category: MySQL
Tags: innodb, innodb_data_file_path, innodb_file_per_table
9 Responses

1.
Morgan Tocker says:
June 20, 2008 at 2:06 pm

There’s situations when either one can be a better choice.

There are more disadvantages of using file per table - you end up consuming significantly more file descriptors (which is a problem particularly for MySQL on Windows). The main InnoDB thread that wakes up every few seconds will also have to flush a lot more files when it checkpoints.

You don’t need to mysqldump the data to convert from the shared global tablespace to the file per table model (but you don’t reclaim space unless you do). A CREATE TABLE new_table LIKE old_table; INSERT INTO new_table SELECT * FROM old_table; DROP TABLE old_table; RENAME TABLE new_table to old_table; is the alternative.
2.
Gavin Towey says:
June 20, 2008 at 4:14 pm

You don’t even have to do that much to convert tables from the single tablespace to the individual files. After enabling “innodb_file_per_table” and restarting mysql just do “ALTER TABLE foo ENGINE=InnoDB” and it will rebuild the table and pull it out of the shared tablespace.

However if you ever want to reclaim all that shared tablespace then you’ll have to dump the db, shutdown mysql & delete the tablespace / log & .frm files then edit the my.cnf to re-adjust the shared tablespace file sizes and then restart and let mysql build empty files, then import your data back in.

It’s also been shown that the shared tablespace has a bit faster read/write performance.
3.
Xaprb says:
June 22, 2008 at 7:34 am

With file-per-table, the shared tablespace is still configurable as usual. It just doesn’t have as much data in it, so you can configure it to be smaller. It does have more than meta-data, though. For example: undo logs, insert buffer, doublewrite buffer, etc. Lots of “real” data — the kind of stuff that will cause lost data if you delete it.
4.
phasnox says:
July 2, 2008 at 12:15 pm

Hi, great article..
you said
“You can specify multiple data files with the innodb_data_file_path option in my.cnf, and ‘spread the load’ over multiple partitions or hard drives.”

how do you do that?
5.
guzik » Blog Archive » MySQL: multiple tablespaces says:
August 1, 2008 at 10:09 pm

[...] Differences Between innodb_data_file_path and innodb_file_per_table [...]
6.
Dieter says:
August 2, 2008 at 5:31 am

With MyISAM one could easily copy a database directory from one server to another and it could be used there immediately.

Will this work with an InnoDB database that has been split into per-table files too?
7.
ICD says:
August 6, 2008 at 3:27 pm

ON a version 5.0.36sp1-enterprise-gpl MySQL Enterprise Server (GPL)

I have the innodb_file_per_table set on without the innodb_data_file_path set and the
following is true when I check the global variables:
have_innodb has a value of YES
innodb_data_file_path has a value of ibdata1:10M:autoextend
innodb_file_per_table has a value of ON

This to me seems to be a bug. If there will be individual .ibd files, why would an ibdata1 be created? Secondly, anytime I try setting innodb_data_file_path it turns off the InnoDB engine and have_innodb winds up having a value of NO. The error when I do set innodb_data_file_path is
080805 7:16:57 [ERROR] InnoDB: syntax error in innodb_data_file_path

Here is the entry in the my.cnf:
datadir = /data/mysql
innodb_data_home_dir = /data/mysql
innodb_data_file_path = ibdata1:50M;ibdata2:50M:MAX:100M
innodb_file_per_table

Any ideas. I want to have the .ibd per table and not have an ibdata1 which autoextends.
8.
ILD says:
August 6, 2008 at 3:33 pm

Here is my.cnf
datadir = /data/mysql
innodb_data_home_dir = /data/mysql
innodb_data_file_path = ibdata1:50M;ibdata2:50M:MAX:100M
innodb_file_per_table

The above disables innodb.
When I have the following in my.cnf
datadir = /data/mysql
innodb_file_per_table

It correctly enables innodb and it produces individual ibd tables but it winds up with the following default attribute:

innodb_data_file_path = ibdata1:10M:autoextend

How can I turn this off and still have innodb enabled with the innodb_file_per_table turned on?

This seems to be a bug.

Here is the version I am working with
mysql Ver 14.12 Distrib 5.0.36sp1, for unknown-linux-gnu (x86_64) using readline 5.0
9.
herodiade says:
November 29, 2008 at 7:14 am

Actually, you can spread the ibdata files on different places (ie. several raid 10 arrays) just as with innodb_file_per_table.

This is just a matter of setting :
innodb_data_home_dir =
innodb_data_file_path = /d1/ibdata1:1024M;/d2/ibdata2:1024M;/d3/ibdata3:1024M:autoextend

In my opinion, the real case for innodb_file_per_table is the infamous “innodb ibdata files do not shrink” bug.

Unknown table engine ‘InnoDB’

December 23rd, 2007

I was getting the error “ERROR 1286 (42000): Unknown table engine ‘InnoDB’” when trying to create/alter tables with the InnoDB engine. You can check for this warning by issuing the create or alter statements and then running
show warnings;
in the same mysql client session.

Check if you have InnoDB support enabled:
mysql> show variables like 'have_innodb';
+---------------+----------+
| Variable_name | Value    |
+---------------+----------+
| have_innodb   | DISABLED |
+---------------+----------+

The problem is that InnoDB is not enabled by default in the my.cnf shipped with Debian distributions. To enable it, simply comment out the following line under the [mysqld] section:

#skip-innodb

psexec.exe wuauclt.exe tool works intermittently

1isaann
unregistered, posted 05-23-2006 07:50 AM

I have set up a Hyena tool that reads: C:\Program Files\Hyena\Tools\psexec.exe \\%E% wuauclt.exe /detectnow

My problem is that this works on some clients, but not others. It just flashes a command prompt screen indicating it is connecting, but then the screen closes and disappears.

Does anyone have any suggestions? I am at a loss.

Thanks,
Lisa

Trammel
Member, posted 05-23-2006 05:46 PM

I have had this experience with psexec when trying to execute it on some remote machines. I have found that using psexec.exe to execute a batch file is more reliable. Not only that, you can stick a pause in the batch file. This will stop the cmd window from disappearing.

Obviously in the batch file is where you put your wuauclt /detectnow command

1isaann
unregistered, posted 05-25-2006 06:05 AM

Thanks, Trammel. I created a batch file located at C:\Program Files\Hyena\Scripts\wuaucltdetect with the following commands:

@echo on
wuauclt /detectnow &pause
@echo off

I'm new at batch files, so I was wondering if this is all I needed. Also, when I set the tool up in Hyena, should it read like this:

C:\Program Files\Hyena\Tools\psexec.exe C:\Program Files\Hyena\Scripts\wuaucltdetect.bat

I wasn't sure if I had to list the entire path to the batch file. Is the syntax correct because it is doing the same thing.

Thanks for your help.
Lisa

1isaann
unregistered, posted 05-25-2006 10:44 AM

I have just tried running psexec \\computername cmd.exe from the RUN command and the same thing happens. The window closes as fast as it opens. I guess this means that it isn't my wuauclt command or Hyena. Any other thoughts as to why this is happening on most of my clients?

Thanks again,
Lisa

Trammel
Member, posted 05-25-2006 08:30 PM

Your command line should look something like this:
\\servername\scripts\HyenaTools\Pstools\psexec.exe \\%E% -u domain\username -p password -c -f "\\servername\Scripts\HyenaTools\Scripts\wuaucltdetect.bat"

The batch file is close.
@ECHO OFF
ECHO Running Wuauclt Detect
wuauclt /detectnow
ECHO Command completed
pause

I use a sharepoint for all of my tools instead of my local machine. You should be able to figure out the path difference and get yours to work. Make the batch file and then you can use a command window, paste or type in the command. This is how you can ensure the command is going to work. Once you have it working, then you can remove the pause

Hope this helps.

[This message has been edited by Trammel (edited 05-25-2006).]

1isaann
unregistered, posted 05-26-2006 05:04 AM

Trammel, you rock! Thanks for your help. The problem was that I was not logged on as an administrator and was not adding the admin username and password to the command line. Once I did that, everything worked great.

Thanks again!

Trammel
Member, posted 05-26-2006 11:08 AM

Another satisfied Hyena user.

Wait, where's my commission Kevin

[This message has been edited by Trammel (edited 05-26-2006).]

rtmcmullen
Junior Member, posted 09-19-2006 12:18 PM

Is there any way to have this dump into a log file for a status?

Trammel
Member, posted 09-19-2006 06:00 PM

You could add a command in the batch file to append to a text file. Add this line to your batch file and point it to a network share instead of the C drive.

echo %date% , %time% , %computername% , %username% , >> c:\echotest.txt

Default Javascript to the footer for better performance

What do you mean with:

Default Javascript to the footer for better performance

I am interested but could not find any related info on this matter.
Do you have any Drupal URLs where this is discussed?

===


Resources loaded within the head tags are loaded in sequence before any page rendering is performed, hence slowing down the 'frontend' load time.

Friday, April 24, 2009

How the idea of "noble and lowly occupations" has crippled Taiwan's software industry

Confucius ranked people as noble or lowly by their occupation, which is rather odious, and the lingering poison of that idea persists to this day.

Of course, the old ranking of scholar, farmer, artisan, merchant has shifted a lot: merchants now enjoy high status, farmers lower. "Scholar" meant holding government office, but being a manager in a private company now counts too.

In any case, the habit of judging people's worth by their occupation has not changed. So you often see parents asking their children, "Have you made manager yet?", because only then can they brag to relatives and friends. Women placing personal ads with stricter requirements will also write that the man must be a manager; a husband who is a manager brings face.

The software industry badly needs engineers with solid technical skills, but in Taiwan their fate is a rough one.

First, elders and relatives ask about your job: "An engineer? You'll never amount to anything if you keep doing that; you have to become a manager." So everyone wants to be a manager.

A good manager needs solid training in the trenches; only then can he manage, knowing what a reasonable budget and schedule look like and whether an employee is lying. He also needs deft people skills to communicate and coordinate with other departments and to boost morale, and a sharp eye for talent so that excellent employees willingly work for him.

With so many demanding requirements, clearly not everyone is suited to be a manager. And a person's mental bandwidth is limited: some people with mildly autistic tendencies are poor at socializing yet can handle extremely complex programs and problems, while socially adept people who read others well are often afraid of writing code. The best arrangement is to let engineers focus on programming and stay out of management, and let those who are suited to it do the managing. Germany recognized this long ago, letting scientists focus on research without having to manage; as long as they produce good products they still get high pay and promotions. Japan likewise has its "master craftsman" tradition and deeply respects workers with superb skills. That is why German and Japanese product quality has such a reputation.

In Taiwan, people who can write a few simple programs are in a hurry to jump upward. There are crowds of people who want to do systems analysis, who think that being able to talk big means they can analyze and also manage engineers, while only a few want to write code, and even then reluctantly.

Because their programming foundation is weak, their systems analysis and management turn into a mess as well. People often declare that not being good at programming doesn't matter, you can just toss the work to the engineers. But someone who lacks the mental horsepower to program will collapse instantly when faced with complex business processes, tightly interlocking logistics and financial relationships, and users' evasive, self-interested statements. Lucky clients turn hostile early and order that person off the project; unlucky ones spend astonishing amounts of money buying a system whose logic doesn't hold together and that simply cannot be used.

Writing programs is supposed to be fun, like solving puzzles in a video game; solving a problem and shipping a feature is itself a great joy for an engineer. But run into an incompetent manager who sets schedules arbitrarily, treats success as merely expected and failure as deserving punishment, and the happy work becomes a painful purgatory. The systems analysts also keep picking fights with the engineers: if things go well it's because the analysis was done right, if they go badly it's because the engineers didn't write it properly, and who knows how much is wasted on infighting alone. I once had a boss who simply made his engineers go out and visit clients, because he believed social skills can be trained, and that doing your own analysis, planning and coding is fast, convenient and happy. Only one engineer with autistic tendencies was exempt, but the superb programs he wrote were the company's money-making weapon.

I have written all this only because I feel that rather than complaining that Taiwan can't make good products the way Germany and Japan do, we should first change our own attitudes: stop ranking people by occupation and give more recognition to craftsmen and engineers with superb skills. That is the real solution!

This goes well with this piece:

Why bosses are always experts at bragging - excerpted from TIME magazine


Big talkers get promoted to manager easily, and once they are managers they naturally think big talkers are the real talent.
No way around it; that's just how the real world works.

Thursday, April 23, 2009

Modulus With Bitwise Masks


Sometimes it is necessary to keep a variable within a certain range, e.g. 0 - 255, 0 - 127, etc. I will show you an easy way to do modulo with base-2 numbers using bitwise masking.

First off, a little background on what modulus does. Modulus finds the remainder after division of two numbers. So the result of a % b will always be in between 0 and one less than b.

Secondly, you need to know how the bitwise AND, &, works. It will compare the corresponding bits in each term, and only set the bit in the result if both bits in each term are set.
CODE

0101
AND 0011
= 0001


Now that we understand modulus and binary AND, let's combine the two concepts.

Since the result of the modulus will always be at most one less than the value you are dividing by, we can use a bitwise mask to perform the same operation on a base-2 number. First off, a bitwise mask is just a row of consecutive bits used to mask certain bits in a byte (00001111). The masks can be located anywhere inside the byte(s) and be of any length, but for the purposes of modulo they will always start at bit 0.

Now if we were to do a modulo of 8, the result would be in the range 0-7, 0000 - 0111 in binary. That looks like a mask, right? It is, and this is how we will perform the modulo: we set up a mask of the number - 1.
CODE

16 - 15 (0x0F)
32 - 31 (0x1F)
64 - 63 (0x3F)
128 - 127 (0x7F)
256 - 255 (0xFF)


Simple example of how this works.
CODE

45 % 32
45 & 31
45 & 0x1F

00101101 (45)
AND 00011111 (31)
= 00001101 (13)
45 % 32 = 13


By masking off only the bits we need, we come up with a result that is within the range we need.
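
The same trick is easy to sanity-check in PHP (the language used elsewhere on this blog), assuming the divisor is a power of two:

<?php
// For a non-negative $x and a power-of-two divisor, $x % $d equals $x & ($d - 1).
$x = 45;
echo ($x % 32) . "\n";       // 13
echo ($x & (32 - 1)) . "\n"; // 13, same result via the bitmask
var_dump(($x % 32) === ($x & 31)); // bool(true)
?>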

Now how can this be implemented.
CODE

//Lazy way
for(int i = 0, j = 0; i < 256; i++)
{
if(j % 16 == 0) j = 0;
else j++;
}

//Better way
for(int i = 0, j = 0; i < 256; i++)
{
j++;
j %= 16;
}

//Combine it into one line
for(int i = 0, j = 0; i < 256; i++)
{
++j %= 16;
}

//Combined using binary mask
for(int i = 0, j = 0; i < 256; i++)
{
++j &= 0xF;
}

By using binary operations you will improve the execution times of your algorithms significantly, as the processor can deal directly with the bits much faster than doing the division out.

Good luck and happy coding.

How to learn a foreign language well

Learning grammar is less effective than learning by ear. Once something sounds natural to you, try saying it out loud during an exam; whichever phrasing sounds smoother is usually the correct answer.

The order for learning a language should be listening -> speaking -> reading -> writing.
But most teaching in Taiwan, chasing quick results, goes reading, writing, listening, speaking... quickly learned and quickly forgotten. And apart from reading and writing being OK, you end up unable to really communicate face to face.

Which software can convert video to MP3? Converting with KMPlayer gives me clipped/distorted audio.


Easy Video to Audio Converter

May I also ask: is there software that can convert a DVD's audio track directly to MP3?

There should be quite a few programs for that.

DVDFab can output just the audio track when exporting.

Investment notes

Investment notes:
Step 1: cut down on waste. Step 2: correct bad habits.
Step 3: buy a whole-life insurance policy. Step 4: save up in advance for insurance premiums and income tax.
Step 5: read economics, accounting, finance, money and banking, and international finance (college-level textbooks only).
Step 6: strive to advance in your job.

Wednesday, April 22, 2009

INSERT/UPDATE a large number of data rows with InnoDB

<?php
echo date('Y-m-d h:i:s') . "\n";

include('./common.php');

ini_set('max_execution_time', 0);

$conn = new mysqli($db_server, $db_user, $db_pass);

/* check connection */
if ($conn->connect_errno) {
    printf("Connect failed: %s\n", $conn->connect_error);
    exit();
}

/* change db to your db */
$conn->select_db($db_name['danny_testing']);

mb_internal_encoding("UTF-8");
$conn->query("SET NAMES 'utf8'");
$conn->query("SET CHARACTER SET 'utf8'");

// ### SET AUTOCOMMIT to 0 (for speeding up the process)
$conn->query("SET autocommit=0;");

$sql = "SELECT id, name FROM member";
$rs = $conn->query($sql);

$rowCount = 0;
while ($row = $rs->fetch_array()) {
    $conn->query("UPDATE member SET name = '" . $row['name'] . "_new' WHERE id = " . $row['id']);

    $rowCount++;

    // ### COMMIT (save) the data every 10000 rows
    if ($rowCount % 10000 == 0) {
        $conn->query('COMMIT;');
    }
}

// ### COMMIT (save) the rest of the rows.
$conn->query('COMMIT;');

echo 'Done!';

$conn->close();

echo date('Y-m-d h:i:s') . "\n";
?>

Remove the 'Shortcut to...' Prefix on Shortcuts (All Windows) Popular


Don't like having 'Shortcut to...' appended to every Shortcut? You're not alone. With this tip you can stop Windows from adding this text when creating links.

Open your registry and find the key below.

Create a new BINARY value named 'link', or modify the existing value, and set it to '00 00 00 00' to disable the shortcut text.

Restart Windows before creating any new shortcuts.

Note: This restriction can be used either on a user by user basis by adding it to HKEY_CURRENT_USER or on a computer wide basis by adding it to HKEY_LOCAL_MACHINE.
Registry Editor Example
| Name      | Type       | Data            |
| (Default) | REG_SZ     | (value not set) |
| link      | REG_BINARY | 00 00 00 00     |
(under the key HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\...)
Registry Settings
User Key: [HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Explorer]
System Key: [HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\Explorer]
Value Name: link
Data Type: REG_BINARY (Binary Value)
Value Data: (00 00 00 00 = No Shortcut Text)

Tuesday, April 21, 2009

FreeBSD MS SQL note

According to the PHP manual, for PHP on a Windows platform to read MSSQL you only need to copy a DLL from the SQL Server machine into C:\%sysroot% on the PHP machine; sure, you also have to uncomment the related extension in php.ini.

But for PHP on Linux/FreeBSD/UNIX to read MSSQL, you have to go through FreeTDS.

I grabbed freetds-0.62.4.tar.gz.

1. First, compile & install freetds.
My configure parameters were:
./configure --prefix=/usr/local/freetds --with-tdsver=8.0 --enable-msdblib

2. Recompile PHP.
At configure time, add one more parameter: --with-mssql=/usr/local/freetds

After that you can play with PHP's built-in SQL Server functions directly!
I was connecting to SQL Server 2000.

============

# make TDS_VER=8.0 install


# tsql -C
Compile-time settings (established with the "configure" script):
Version: freetds v0.64
MS db-lib source compatibility: no
Sybase binary compatibility: unknown
Thread safety: yes
iconv library: yes
TDS version: 8.0
iODBC: no
unixodbc: yes


# tsql -S 192.168.100.1 -U danny

============
For all of you connecting to MS SQL Server using freeTDS, having trouble to log on with a domain login:

Add the line:
putenv('TDSVER=70');

just before calling mssql_connect()

This worked for me with a MS SQL Server 2K Ent.

===========
FreeTDS works!!!

But after installing you should add the following to the config files:
1. Add the client charset in freetds.conf:
[TDS_NAME_TO_CONNECT]
host = 192.168.0.1
port = 1433
tds version = 8.0
client charset = UTF-8
;client charset = CP1251

2. For datetime add in locales.conf in [default]:
date format = %Y-%m-%d %I:%M:%S.%z

3. restart apache

4. In your php-script you should connect to [TDS_NAME_TO_CONNECT]
mssql_connect('TDS_NAME_TO_CONNECT', 'php', '123');
=======

This took me a while to figure out, so I may as well pass it along to save somebody time.

If you are having trouble connecting with a DSNless connection using unixodbc with freetds, don't forget to specify the TDS_Version in your connection string.

Example (8.0 is for SQL Server 2000 exclusively):

$dsn = "DRIVER=FreeTDS;TDS_Version=8.0; TRACE=No; SERVER=someserver; PORT=someport; DATABASE=somedb; UID=somelogin; PWD=somepassword";
=====
Couple of notes:
When setting this up, you might notice that the unixODBC isql command likes the password wrapped in single quotes:
isql -v MyDSN MyUserID 'MyPa$$W0rd'

Additionally, if you happen to have a dollar-sign in your password (or username, or DSN) -- you must avoid using double quotes. This is a normal PHP gotcha, but worth mentioning.
Won't work:
$con = mssql_connect ("MyDSN", "MyUserID", "MyPa$$W0rd");

Will work (single quotes):
$con = mssql_connect ("MyDSN", "MyUserID", 'MyPa$$W0rd');





http://www.hoyo.idv.tw/hoyoweb/document/436.html
====================

PDO_DBLIB


cd /usr/ports/databases/php5-pdo_dblib

====================
ADOdb Database Abstraction Library for PHP (and Python


======



When installing php4-extensions OR php5-extensions from ports, select "sybase_ct" and FreeTDS will be installed for you automatically...
What you need to configure is:




The PHP manual has a section:

PDO_DBLIB DSN — Connecting to Microsoft SQL Server and Sybase databases

DSN prefix
The DSN prefix is

sybase: if PDO_DBLIB was linked against the FreeTDS libraries

mssql: if PDO_DBLIB was linked against the Microsoft SQL Server libraries

dblib: if linked against any other variety of DB-lib.

But the paragraph above seems a bit odd... in phpinfo(), pdo_dblib shows freetds enabled.


php5-sybase_ct-5.2.8 The sybase_ct shared extension for php



foreach (PDO::getAvailableDrivers() as $driver) {
    echo $driver . "\n";
}



According to the manual, I should use
new PDO("sybase:host=$hostname;dbname=$dbname", $username, $pw)
but it said the driver could not be found.

Next I tried
new PDO("dblib:host=$hostname;dbname=$dbname", $username, $pw)

That could connect, but it was very slow (just connecting to the DB took 15 seconds, before even doing a SELECT).

It turned out the key was this BSD port:
php5-sybase_ct-5.2.8 The sybase_ct shared extension for php

After installing it, connecting with dblib runs at normal speed, but sybase still says the driver cannot be found :P


Without it the speed is very slow; with it installed everything is normal.

(At first I couldn't figure out how the pieces relate to each other, e.g. FreeTDS, php-mssql, sybase, pdo-dblib, etc.)

I then tested INSERTs with both new PDO() and mssql_connect(); the new PDO() approach was slightly faster.


===========
$dbh = new PDO ("dblib:host=$hostname;dbname=$dbname;charset=UTF-8",$username,$pw );
===========

lock request time out period exceeded


http://ca2.php.net/manual/en/ref.pdo-dblib.connection.php

http://www.phpro.org/tutorials/Introduction-to-PHP-PDO.html

http://swik.net/MySQL/MySQL+vs+MS+SQL+Server




http://192.168.184.130/test3.php

http://www.developersdex.com/sql/message.asp?p=581&r=6535941

http://social.msdn.microsoft.com/Forums/en-US/vstsdb/thread/fa474346-3063-4b79-8830-98411029bdc0/

http://www.sql-server-performance.com/tips/blocking_p1.aspx

SET LOCK_TIMEOUT -1

=============================

include("adodb5/adodb.inc.php");
//odbc_connect("DSN=MyServer2k","db_user","db_pass");


//$db_connection = new COM("ADODB.Connection");

$db = &ADONewConnection("mssql");
$db->Connect('MyServer2k', 'db_user', 'db_pass', 'danny_testing');
$rs = $db->Execute('select top 10 * from my_member2');
print_r( $rs );

====================================

$conn = mssql_connect("MyServer2k", "db_user", "db_pass" );
mssql_select_db('danny_testing', $conn);


=======================
$conn = new COM ("ADODB.Connection", NULL, CP_UTF8)

=======================



$dsn = "odbc:Driver={SQL Server};Server=192.168.200.119;Database=db_name;Uid=db_user;Pwd=db_pass;charset=UTF-8";
//$conn = new PDO ("odbc:odbc_connection_name;charset=UTF-8","db_user","db_pass");

$conn = new PDO($dsn, "db_user", "db_pass", null);

$query = "SELECT TOP 10 id_report, title FROM nai_news";

$sth = $conn->prepare( $query );
$sth->execute();

/* Exercise PDOStatement::fetch styles */

while ($result = $sth->fetch(PDO::FETCH_ASSOC)) {
    //echo $result['id_report'] . " " . $result['title'] . "\n";
    //$block_content .= l($links->title, 'node/' . $links->id_report) . "\n";
    print_r($result);
    //echo $result->id_report;
}

unset($conn);

DBConvert - DBSync for MS SQL & MySQL

DBSync for MS SQL & MySQL

The DBConvert product line is a constantly growing range of tools focused on data migration between different database management platforms.
It includes a number of converters and synchronizers that provide an accurate and convenient way to transfer data from one server to another.
DBConvert tools handle transforming and replicating data between most of the leading database engines: MS Access, MS SQL, MS FoxPro, MySQL, PostgreSQL and others.

Monday, April 20, 2009

danny_library.php

<?php
/**
*
* @param
* @return the last day of the month
*/
function _getLastDayOfMonth($year, $month) {
$lastDay = idate('d', mktime(0, 0, 0, ((int)$month + 1), 0, (int)$year));
return $year . '-' . $month . '-' . $lastDay;
}

/**
* This function will usually be used with convert_smart_quotes() function.
* @param
* @return
*/
function str_convert_cp1252_utf8($str) {
$strEncode = mb_detect_encoding($str, 'UTF-8, ISO-8859-1');
if ($strEncode == 'ISO-8859-1') {
$str = mb_convert_encoding($str, 'UTF-8', 'CP1252');
}

return $str;
}

function convert_smart_quotes($string) {
// $search = array(chr(145),
// chr(146),
// chr(147),
// chr(148),
// chr(151));
$search = array(
'‘',
'’',
'“',
'”',
'–',
);
$replace = array(
"'",
"'",
'"',
'"',
'-',
);

return str_replace($search, $replace, $string);
}
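
/**
 * Example usage of the two helpers above (illustrative only; the sample text is made up):
 *
 *   $raw   = "It\x92s a \x93smart quoted\x94 CP1252 string";
 *   $clean = convert_smart_quotes(str_convert_cp1252_utf8($raw));
 *   // $clean: It's a "smart quoted" CP1252 string
 */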

/**
* This function checks for dates in a string.
* The string MUST be in one of 2 formats listed below in the examples.
* @param
* @return
* is_date('2008-02-01'); //Returns TRUE
* is_date('ssss 2008-02-01'); //Returns FALSE
* is_date('2008-02-01 02:33:10') //Returns TRUE
*/
function is_date($date) {
$len = strlen($date);
if ($len <= 10) {
return preg_match('/^[0-9]{4,4}-[0-9]{1,2}-[0-9]{1,2}$/', $date);
}
elseif ($len > 10 && $len <= 19) {
return preg_match('/^[0-9]{4,4}-[0-9]{1,2}-[0-9]{1,2} [0-9]{1,2}:[0-9]{1,2}:[0-9]{1,2}$/', $date);
}
else {
return 0;
}
}

/**
* Parse a CSV string into an array ().
* The build-in fgetcsv() function required you to specify a file pointer.
* This function used memory instead of a file pointer.
* @param
* @return
*/
function str_getcsv_mem($input, $delimiter = ",", $enclosure = '"', $escape = "\\") {
$outputArr = Null;

$oneHundredKBs = 1 * 100 * 1024;
$fp = fopen('php://temp/maxmemory:' . $oneHundredKBs, 'r+');
fputs($fp, $input);
rewind($fp);

$rowCount = 0;
while (($row = fgetcsv($fp, 1000, $delimiter, $enclosure)) !== FALSE) { //$escape only got added in 5.3.0
foreach ($row as $key => $col) {
$outputArr[$rowCount][$key] = $col;
}

$rowCount++;
}

fclose($fp);

return $outputArr;
}
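
/**
 * Example usage of str_getcsv_mem() (illustrative only; the CSV text is made up):
 *
 *   $csv  = "name,age\n\"Smith, John\",42";
 *   $rows = str_getcsv_mem($csv);
 *   // $rows[0] = array('name', 'age')
 *   // $rows[1] = array('Smith, John', '42')
 */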

/**
* eat up multiple spaces (replace space) to single space
* @param
* @return
*/
function _eatSpaces($str) {
return ereg_replace( ' +', ' ', $str );
}

/**
* Split a string to an array by an array of positions.
* @param
* @return
*
* Sample:
* $pos = array(0, 1, 3, 6, 10);
* $str = "abbcccddddeeeee";
* $arr = split_str_by_pos($str, $pos);
*/
function split_str_by_pos($str, $pos) {
$arr = array();

$count = count($pos);
$leng = strlen($str);

for ($i=0;$i<$count;$i++) {
$start = $pos[$i];
$end = ($i == $count - 1) ? $leng : $pos[$i+1];

$fieldVal = '';

for ($j = $start; $j < $end; $j++) {
$fieldVal .= $str[$j];
}

$arr[] = $fieldVal;
}

return $arr;
}

/**
* Return duplicate values.
* @param
* @return
*/
function array_duplicates($arr) {
if (!is_array($arr)) {
return false;
}

$duplicates = array();

sort($arr);

$count = count($arr) - 1; // ### last index won't have duplicate.
for ($i = 0; $i < $count; $i++) {
if ($arr[$i] == $arr[$i+1]) {
$duplicates[] = $arr[$i];
$i++; // ### Skip next index.
}
}

return $duplicates;
}
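
/**
 * Example usage of array_duplicates() (illustrative only):
 *
 *   $dups = array_duplicates(array(1, 3, 2, 3, 1, 4));
 *   // $dups = array(1, 3)  -- each duplicated value is reported once
 */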

/**
* Check if a number is an odd number.
* @param $var the number to test
* @return 1 if odd, 0 if even
*/
function is_odd($var)
{
return($var & 1);
}

/**
* Check if a number is an even number.
* @param $var the number to test
* @return TRUE if even, FALSE if odd
*/
function is_even($var)
{
return(!($var & 1));
}


/**
* Generate a random alphanumeric password.
*/
function random_password($length = 10) {
// This variable contains the list of allowable characters for the
// password. Note that the number 0 and the letter 'O' have been
// removed to avoid confusion between the two. The same is true
// of 'I', 1, and 'l'.
$allowable_characters = 'abcdefghijkmnopqrstuvwxyzABCDEFGHJKLMNPQRSTUVWXYZ23456789';

// Zero-based count of characters in the allowable list:
$len = strlen($allowable_characters) - 1;

// Declare the password as a blank string.
$pass = '';

// Loop the number of times specified by $length.
for ($i = 0; $i < $length; $i++) {

// Each iteration, pick a random character from the
// allowable string and append it to the password:
$pass .= $allowable_characters[mt_rand(0, $len)];
}

return $pass;
}

/**
* The built-in in_array() was not what I wanted, so I made this function.
* @param $searchArr value(s) to look for
* @param $beSearchedArr array to search in
* @return TRUE if any value is found, FALSE otherwise
*/
function in_array2($searchArr, $beSearchedArr) {
$searchArr = is_array($searchArr) ? $searchArr : array($searchArr);

foreach ($searchArr as $search) {
foreach ($beSearchedArr as $beSearched) {
if (strcmp($search, $beSearched) == 0) {
return true;
}
}
}

return false;
}

/**
* Build an INSERT statement from a table name and an array of field values.
* @param $fieldArr = array('field_name' => 'field_value'); NULL values become SQL NULL.
* @return The generated INSERT statement as a string (values are not escaped).
*/
function sql_rewrite_insert($tableName, $fieldArr) {
$fieldStr = '';
$fieldStr2 = '';

foreach ($fieldArr as $key => $val) {
$fieldStr .= $key . ',';
$fieldStr2 .= is_null($val) ? 'NULL,' : "'" . $val . "',";
}

// ### get rid of the last comma.
$fieldStr = mb_substr($fieldStr, 0, mb_strlen($fieldStr) - 1, 'UTF-8');
$fieldStr2 = mb_substr($fieldStr2, 0, mb_strlen($fieldStr2) - 1, 'UTF-8');

$output = 'INSERT INTO ' . $tableName . ' (' . $fieldStr . ') VALUES (' . $fieldStr2 . ');';

return $output;
}

/**
* Build an UPDATE statement from a table name, an array of field values
* and an optional array of WHERE conditions (ANDed together).
* @param $fieldArr = array('field_name' => 'field_value');
* @param $whereFieldArr = array('field_name' => 'field_value');
* @return The generated UPDATE statement as a string (values are not escaped).
*/
function sql_rewrite_update($tableName, $fieldArr, $whereFieldArr = Null) {
$fieldStr = '';

foreach ($fieldArr as $key => $val) {
$fieldStr .= $key . " = '" . $val . "',";
}

$fieldStr = mb_substr($fieldStr, 0, -1, 'UTF-8');

$sql = 'UPDATE ' . $tableName . ' ';
$sql .= 'SET ';
$sql .= $fieldStr . ' ';
$sql .= "WHERE 1=1 ";

if (is_array($whereFieldArr)) {
foreach ($whereFieldArr as $key => $val) {
$sql .= "AND " . $key . " = '" . $val . "' ";
}
}

return $sql;
}
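
// ### Usage sketch (illustrative only; the table and field names are made up):
$demoInsert = sql_rewrite_insert('users', array('name' => 'Tom', 'age' => '30'));
// ### INSERT INTO users (name,age) VALUES ('Tom','30');
$demoUpdate = sql_rewrite_update('users', array('age' => '31'), array('name' => 'Tom'));
// ### UPDATE users SET age = '31' WHERE 1=1 AND name = 'Tom'
// ### Note: neither helper escapes values, so only feed them trusted input.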

/**
* Write data to a file.
* @param $silence: when TRUE, suppress the status and warning messages.
* @return Nothing is returned; status messages are echoed unless $silence is TRUE.
*/
function writeData($fileName, $content, $mode, $silence = false) {
// Let's make sure the file exists and is writable first.
if (is_writable($fileName)) {

// The file is opened with the caller-supplied $mode (e.g. 'a' to append,
// 'w' to overwrite); $content is then written at the resulting file
// pointer position.
if (!$handle = fopen($fileName, $mode)) {
echo 'Cannot open file (' . $fileName . ')';
exit;
}

// Write $content to our opened file.
if (fwrite($handle, $content) === FALSE) {
if (!$silence) {
echo 'Cannot write to file (' . $fileName . ')';
}
exit;
}

if (!$silence) {
echo 'Success, wrote content to file (' . $fileName . ')';
}

fclose($handle);

}
else {
if (!$silence) {
echo 'The file '. $fileName . ' is not writable';
}
}
}

function _getExchangeID2( $exchgCodeL ) {
### V TSXV 1
### T TSX 2
### N NYSE 3
### Q,QM,QS NASDAQ 4
### A AMEX 5
### QB OTCBB 8
switch (strtoupper($exchgCodeL)) {
case 'TSXV':
$exchg = '1';
break;
case 'TSX':
$exchg = '2';
break;
case 'NYSE':
$exchg = '3';
break;
case 'NASDAQ':
$exchg = '4';
break;
case 'AMEX':
$exchg = '5';
break;
case 'OTCBB':
$exchg = '8';
break;
case 'OTCBB PK':
$exchg = '16';
break;
case 'CNQ':
$exchg = '17';
break;
default:
$exchg = 'All Exchanges';
}

return $exchg;
}

function _getExchangeCodeL2( $exchgID ) {
### V TSXV 1
### T TSX 2
### N NYSE 3
### Q,QM,QS NASDAQ 4
### A AMEX 5
### QB OTCBB 8
switch ($exchgID) {
case 1:
$exchg = 'TSXV';
break;
case 2:
$exchg = 'TSX';
break;
case 3:
$exchg = 'NYSE';
break;
case 4:
$exchg = 'NASDAQ';
break;
case 5:
$exchg = 'AMEX';
break;
case 8:
$exchg = 'OTCBB';
break;
case 16:
$exchg = 'OTCBB PK';
break;
case 17:
$exchg = 'CNQ';
break;
default:
$exchg = 'All Exchanges';
}

return $exchg;
}

/**
* @param $stockChange = array('up' => '', 'down' => '', 'nochange' => '');
* @param $outputFormat can be 'file', 'browser'.
* @return Nothing; the chart is written to $fileName, or sent to the browser when $outputFormat is 'browser'.
*/
function stock_change_chart($stockChange = NULL, $fileName, $outputFormat = 'file') {

if (is_array($stockChange)) {

$total = $stockChange['up'] + $stockChange['down'] + $stockChange['nochange'];
$aa = 0;
$bb = ($stockChange['up'] / $total) * 360;
$cc = $bb + ($stockChange['down'] / $total) * 360;
$dd = 360;

// create image
$image = imagecreatetruecolor(100, 100);

// ### For *transparent* PNG image
imagesavealpha($image, true);
$trans_colour = imagecolorallocatealpha($image, 0, 0, 0, 127);
imagefill($image, 0, 0, $trans_colour);

// allocate some colors
$white = imagecolorallocate($image, 0xFF, 0xFF, 0xFF);
$gray = imagecolorallocate($image, 0xCC, 0xCC, 0xCC);
$darkgray = imagecolorallocate($image, 0x90, 0x90, 0x90);
$navy = imagecolorallocate($image, 0x00, 0x00, 0x80);
$darknavy = imagecolorallocate($image, 0x00, 0x00, 0x50);
$red = imagecolorallocate($image, 0xFF, 0x00, 0x00);
$darkred = imagecolorallocate($image, 0x90, 0x00, 0x00);
$green = imagecolorallocate($image, 0x33, 0x99, 0x00);
$darkgreen = imagecolorallocate($image, 0x33, 0x66, 0x00);

// make the 3D effect
for ($i = 60; $i > 50; $i--) {
imagefilledarc($image, 50, $i, 100, 50, $aa, $bb, $darkgreen, IMG_ARC_PIE);
imagefilledarc($image, 50, $i, 100, 50, $bb, $cc , $darkred, IMG_ARC_PIE);
imagefilledarc($image, 50, $i, 100, 50, $cc, $dd , $darkgray, IMG_ARC_PIE);
}

imagefilledarc($image, 50, 50, 100, 50, $aa, $bb, $green, IMG_ARC_PIE);
imagefilledarc($image, 50, 50, 100, 50, $bb, $cc , $red, IMG_ARC_PIE);
imagefilledarc($image, 50, 50, 100, 50, $cc, $dd , $gray, IMG_ARC_PIE);

### flush image

if ($outputFormat == 'browser') {
// ### Send the PNG straight to the browser instead of writing it to a file.
header('Content-type: image/png');
imagepng($image);
}
else {
imagepng($image, $fileName);
}

imagedestroy($image);
}
}
?>
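
A quick usage sketch for the stock_change_chart() helper above (illustrative only; the counts and the file name are made up, and the GD extension must be enabled):

<?php
// ### Draw a 100x100 advancers/decliners/unchanged pie and save it as a PNG file.
stock_change_chart(array('up' => 120, 'down' => 80, 'nochange' => 40), 'stock_change.png');
?>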

adminpack Windows Server 2003 Service Pack 2 Administration Tools Pack for x86 editions

Windows Server 2003 Service Pack 2 Administration Tools Pack for x86 editions

Includes: Remote Desktop Connection (a better version of the client)

http://www.microsoft.com/downloads/details.aspx?familyid=86B71A4F-4122-44AF-BE79-3F101E533D95&displaylang=en

Sunday, April 19, 2009

Debugger, Disassembling, Decompilation, reverse engineering

Debugger, Disassembling, Decompilation, reverse engineering
OllyDbg
Syser
Sourcer
SoftICE

changing varchar to nvarchar - Erland Sommarskog

changing varchar to nvarchar - Erland Sommarskog
11-Nov-07 10:24:28


Better Article - change database columns from varchar to nvarchar if not already nvarchar

Hans - DiaGraphIT - (HansDiaGraphIT@) writes:

Run any of these SELECT:

-- SQL 2005
SELECT 'ALTER TABLE ' + quotename(o.name) +
' ALTER COLUMN ' + quotename(c.name) +
' nvarchar(' + ltrim(str(c.max_length)) + ') ' +
CASE c.is_nullable
WHEN 1 THEN ' NULL'
WHEN 0 THEN ' NOT NULL'
END
FROM sys.objects o
JOIN sys.columns c ON o.object_id = c.object_id
JOIN sys.types t ON c.system_type_id = t.system_type_id
WHERE t.name = 'varchar'
ORDER BY o.name, c.name

-- SQL 2000
SELECT 'ALTER TABLE ' + quotename(o.name) +
' ALTER COLUMN ' + quotename(c.name) +
' nvarchar(' + ltrim(str(c.length)) + ') ' +
CASE columnproperty(o.id, c.name, 'AllowsNull')
WHEN 1 THEN ' NULL'
WHEN 0 THEN ' NOT NULL'
END
FROM sysobjects o
JOIN syscolumns c ON o.id = c.id
JOIN systypes t ON c.xtype = t.xtype
WHERE t.name = 'varchar'
AND o.xtype = 'U'
ORDER BY o.name, c.name

Copy and paste the result. However, there is a reason that I recommended
rebuilding from scripts. ALTER TABLE only takes one ALTER COLUMN, so
if there are several varchar columns in the same table, SQL Server
has to rebuild the table several times. Since nvarchar takes up twice
the space of varchar, this is no simple metadata change. For large
tables, this could take a long time. Building a new database may be
a faster route.


But maybe some of the columns are indexed? In that case, you need to
drop those indexes first.

--
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx

change database columns from varchar to nvarchar if not already nvarchar

change database columns from varchar to nvarchar if not already nvarchar

Hello, I am in a situation where I must update an existing database structure from varchar to nvarchar using a script. Since this script is run every time a configuration application is run, I would rather determine if a column has already been changed to nvarchar and not perform an alter on the table. The databases which I must support are SQL Server 2000, 2005 and 2008.



The following query should get you what you need:

IF EXISTS (SELECT * FROM sysobjects syo
JOIN syscolumns syc ON syc.id = syo.id
JOIN systypes syt ON syt.xtype = syc.xtype
WHERE syt.name = 'nvarchar'
AND syo.name = 'MY TABLE NAME'
AND syc.name = 'MY COLUMN NAME')
BEGIN
ALTER ...
END


You can run the following script which will give you a set of ALTER commands:

SELECT
'ALTER TABLE ' + isnull(schema_name(syo.id), 'dbo') + '.' + syo.name +
' ALTER COLUMN ' + syc.name +
' NVARCHAR(' + case when syc.length >= 4000 Or syc.length = -1 then 'MAX' ELSE convert(nvarchar(10),syc.length) end + ');'
FROM sysobjects syo
JOIN syscolumns syc ON syc.id = syo.id
JOIN systypes syt ON syt.xtype = syc.xtype
WHERE syt.name = 'varchar' AND syo.xtype='U'

-- Note: You will need another script for converting text to ntext !!!!


There are, however, a couple of quick caveats for you.

1. This will only do tables. You'll want to scan all of your sprocs and functions to make sure they are changed to NVARCHAR as well.

2. If you have a VARCHAR > 4000 you will need to modify it to be NVARCHAR(MAX)
But those should be easily doable with this template.

If you want this to run automagically you can wrap it in a WHILE loop.

======================================

When you alter a field from varchar to nvarchar, you might encounter the following error message:

Cannot alter or drop column 'title' because it is enabled for Full-Text Search

Use the following script to list all the fields that have a full-text index enabled.


select
c.name as 'Table',b.name as 'Field'
--c.*, a.*, b.*
from sys.fulltext_index_columns a
join sys.columns b
on a.object_id=b.object_id
and a.column_id=b.column_id
join sys.objects c
on a.object_id=c.object_id


ALTER TABLE script: the following query generates the ALTER FULLTEXT INDEX statements that disable full-text indexing on those columns.

select
'ALTER FULLTEXT INDEX ON dbo.' + c.name + ' DROP (' + b.name + ') WITH NO POPULATION;'
--c.*, a.*, b.*
from sys.fulltext_index_columns a
join sys.columns b
on a.object_id=b.object_id
and a.column_id=b.column_id
join sys.objects c
on a.object_id=c.object_id

SQLyog Sets New Standards for Data Synchronization Speed

SQLyog Sets New Standards for Data Synchronization Speed.
peter_laursen June 21st 2008 Posted to MySQL, SQLyog
In this BLOG post I will concentrate on the achievements with data synchronization with SQLyog 7.0 (beta 1 released just before the weekend).

SQLyog 7 provides a 2-8 times speed improvement (depending on data) with a ‘mixed-type’ sync job (involving both INSERTS, UPDATES and DELETES) as compared to SQLyog 6. A few comparison examples (SQLyog 6, SQLyog 7 and Maatkit) with 4 test cases:

a)
Source Rows: 3950400, Target Rows: 3950400, Inserted Rows: 49599, Updated Rows: 49500, Deleted Rows:49599. Primary Col(INT). InnoDB
SQLyog 6: 1120 sec
SQLyog 7: 267 sec
Maatkit : 530 sec

b)
Source Rows: 48025 , Target Rows: 48775, Inserted Rows: 1225 , Updated Rows:1225, Deleted Rows : 1975. Primary Col(INT), InnoDB.
SQLyog 6: 30 sec
SQLyog 7 : 8 sec
Maatkit : 19 sec

c)
Source Rows:150404 , Target Rows: 152864, Inserted Rows: 12136, Updated Rows: 16236, Deleted Rows : 14596. Primary Cols(VARCHAR, CHAR). InnoDB.
SQLyog 6: 320 sec
SQLyog 7 : 70 sec
Maatkit : Maatkit did not finish after 20 minutes. We cannot tell if it ever will with this example.

d)
Source Rows: 18209, Target Rows: 10000, Inserted: 9018, Updated Rows: 1001, Deleted Rows 809. Primary cols(SMALLINT, BIGINT). MyISAM.
SQLyog 6: 32 sec
SQLyog 7: 8 sec
Maatkit : 24 sec

(all those examples use a ‘developer machine’ configuration as defined by the configuration wizard bundled with the windows installer for the MySQL server 5.0.51b. Sync is between two databases on the same server. Client and server running on the same machine - a 3Ghz Intel Pentium4 with 1G RAM Running Windows XP SP2 - and with no other significant load at the time of sync. Maatkit version is 1877. Perl environment for running Maatkit was created with ActivePerl for Windows).

We would like to credit Baron ‘Xaprb’ Schwartz (author of Maatkit) for the Maatkit algorithms though (this post in particular was a challenge for us) from which we learned a lot. For the most typical PK-setup (a single column integer PK) the somewhat better performance of SQLyog 7 as compared to Maatkit is probably only due to the fact that SQLyog/SJA is a multithreaded compiled (C++) binary and not a server-side (Perl) script. However with other (more unusual) PK setups the difference is bigger to the advantage of SQLyog.

When syncing to an empty table we use a special high-speed code branch (a rather simple copy, actually). Other sync tools (also Maatkit) waste a lot of time looping while testing things that really need not be tested in this situation! SQLyog will be even faster the larger the max_allowed_packet setting in server configuration as this setting will be detected and BULK INSERTS as large as possible will be generated (optionally). With the above test SQLyog was ~50 times faster than Maatkit when syncing to an empty target (but with the 2 largest cases we interrupted Maatkit before it had finished, though!)

You can download the four test cases if you want to verify/test on your environment. And if you think it is not fair to compare with Maatkit on Windows, you can do the same on Linux with the SJA (SQLyog Job Agent) for Linux as well.

Download links (zip archives with SQL dumps for target and source for each case):
case1 (~20 MB): http:webyog.com/downloads/blogresources/datasynctest/datasync_testcase_1.zip
case2 (~3 MB): http:webyog.com/downloads/blogresources/datasynctest/datasync_testcase_2.zip
case3 (~2 MB): http:webyog.com/downloads/blogresources/datasynctest/datasync_testcase_3.zip
case4 (~350 KB): http:webyog.com/downloads/blogresources/datasynctest/datasync_testcase_4.zip
(one detail to observe is that those dumps do not contain a USE statement - be careful about where you import them!)

Download SQLyog and SJA from: http://webyog.com/en/downloads.php
(note that data synchronization is included in SQLyog Enterprise and Enterprise Trial versions only - and in SJA for Linux)

Real-world performance testing - slides from the talk the PHP creator is giving at OSDC

http://talks.php.net/show/osdctw09

strace your web server process

strace your web server process
% /usr/sbin/apache2 -X &
[1] 16367
(hit page a few times to warm up caches)
% strace -p 16367 -o sys1.txt
Process 16367 attached - interrupt to quit
(hit page once)
Process 16367 detached
% grep stat sys1.txt | grep -v fstat | wc -l
153

Alternative PHP Cache (APC)

The Alternative PHP Cache (APC) is a free and open opcode cache for PHP. It was conceived of to provide a free, open, and robust framework for caching and optimizing PHP intermediate code.

Siege is an http regression testing and benchmarking utility

ABOUT SIEGE
Siege is an http regression testing and benchmarking utility. It was designed to let web developers measure the performance of their code under duress, to see how it will stand up to load on the internet. Siege supports basic authentication, cookies, HTTP and HTTPS protocols. It allows the user to hit a web server with a configurable number of concurrent simulated users. Those users place the webserver "under siege."

PLATFORM SUPPORT
Siege was written on GNU/Linux and has been successfully ported to AIX, BSD, HP-UX and Solaris. It should compile on most System V UNIX variants and on most newer BSD systems. Because Siege relies on POSIX.1b features not supported by Microsoft, it will not run on Windows. Of course you can use Siege to test a Windows HTTP server.

NEWS & RELEASE INFORMATION
SIEGE-2.67 April 15, 2008 07:15 EDT This is the current stable release. It implements HTTP/1.1 cache validation. Since its primary job is to simulate load, cache validation is off by default. You may activate it in your new .siegerc file. Developers, beta testers and adventurous souls, click HERE for the latest BETA version. It is basically a code cleanup. This is the final beta release. Look for version 2.66 soon.

"This is an excellent application that does something expensive suites don't. ie. It does what the users expects it to do, and does it simply."
-- Sgt. Scrub

Saturday, April 18, 2009

Simple is Hard

Simple is Hard

Speed up your web pages with YSlow

Speed up your web pages with YSlow
YSlow analyzes web pages and tells you why they're slow based on the rules for high performance web sites. YSlow is a Firefox add-on integrated with the popular Firebug web development tool. YSlow gives you:

An algorithm to find and resolve data differences between MySQL tables

An algorithm to find and resolve data differences between MySQL tables
with 10 comments
I’ve been designing an algorithm to resolve data differences between MySQL tables, specifically so I can ‘patch’ a replication slave that has gotten slightly out of sync without completely re-initializing it. I intend to create a tool that can identify which rows are different and bring them into sync. I would like your thoughts on this.
Background and requirements
I see this as the next step in my recent series of posts on MySQL tools and techniques to keep replication running reliably and smoothly. Sometimes slaves “drift” a little bit, even when there don’t seem to be any issues with replication (this is one reason I submitted a bug report to add checksums on binlog events). Once a table differs on the slave, it gets more and more different from the master, possibly causing other tables to differ too.
I need a tool that, given a table known to differ on master and slave(s), will efficiently compare the tables and resolve the differences. Finding tables that differ is easy with MySQL Table Checksum, but I am not sure the best way to find which rows differ.
Here are my requirements. The algorithm needs to be:
Designed for statement-based replication, which means no temp tables, no expensive queries that will propagate to the slave, and so forth.
Efficient in terms of network load and server load, both when finding and when resolving differences. No huge tables or un-indexed data, no high-impact INSERT.. SELECT locking, etc.
Efficient on the client-side where the tool is executed.
Must work well on “very large” tables.
Some things I assume:
Tables must have primary keys. Without primary keys, it’s hard or a waste of time at best, and a disaster at worst.
It is not a good idea to do this unless the fraction of rows that differ is very small. If much of the table is different, then mysqldump is a better idea.
Other tools I’ve found
I’ve found a number of tools that are either not complete or don’t quite address the need, but reading the source code has been very productive. There’s Giuseppe Maxia’s work in remote MySQL table comparison. I based the MySQL Table Checksum tool on some of this work. Read the comments on that link, and you’ll see some follow-up from Fabien Coelho, who wrote pg_comparator. The documentation for this tool is an excellent read, as it goes into great detail on the algorithm used.
There are also a few projects that don’t do what I’m looking for. datadiff does a two-way in-server comparison of two tables with OUTER JOIN, a fine technique but inherently limited to two tables on one server, and not practical for extremely large tables. coldiff is a more specialized variant of that tool. mysqldiff diffs the structure of two tables, which I mention for completeness though it is not the problem I’m trying to solve.
The Maxia/Coelho bottom-up algorithm
Without restating everything these smart people have written, here’s a high-level overview of the algorithm as presented by Maxia and Coelho:
Compute a “folding factor” based on the number of rows in the table and/or user parameters.
Build successive levels of checksum tables bottom-up, starting at a row-level granularity and decreasing granularity by the “folding factor” with each level, until the final table has a single row.
Each row in the first table contains key column(s), a checksum of the key column(s), and a checksum of the whole row.
Each row in an intermediate-level summary table contains checksums for a group of rows in the next more granular level of summary data.
Groups are defined by taking checksums from the previous level modulo the folding factor.
Beginning at the most aggregated level, walk the “tree” looking for the differences, honing in eventually to the offending rows.
The “folding factor” is really a “branching factor” for the tree of summary tables. If the factor is 128, each level in an intermediate summary table will contain the groupwise checksum of about 128 rows in the next most granular level summary table.
This algorithm has many strengths. For example, it uses a logarithmic search to find rows that differ. It makes no assumptions about key distributions; the modulo operation on the checksum should randomize the distribution of which rows need to be fixed. It’s also very generic, which means it works pretty much the same on all tables. There’s no need to think about the “best way to do it” on a given table.
I am concerned about a few things, though. There’s a lot of data in all these summary tables. The first summary table contains as many rows as the table to analyze. If I were to calculate and store these rows for a table with lots of relatively narrow rows, I might be better off just copying the whole table from one server to the other. Also, creating these tables is not replication-friendly; the queries that run on the master will run on the slave too. This might not be a problem for everyone, but it would not be acceptable for my purposes.
The second part of the algorithm, walking the “tree” of summary tables to find rows that differ, doesn’t use any indexes in the implementations I’ve seen. Suppose I have a table with 128 million rows I want to analyze on two servers, using a branching factor of 128 (the default). The first checksum table has 128 million rows; the second has 1 million, and so on. Repeated scans on these tables will be inefficient, and given the randomization caused by the summary checksums, will cause lots of random I/O. Indexes could be added on the checksum modulo branching factor, but that’s another column, plus an index — this makes the table even bigger.
The checksum/modulo approach has another weakness. It defeats any optimizations I might be able to make based on knowledge of where in the table the rows differ. If the differences are grouped at the end of the table, for example in an append-only table that just missed a few inserts on the slave, the algorithm will distribute the “pointers” to these corrupt rows randomly through the summary tables, even though the rows really live near each other. Likewise, if my table contains client data and only one client is bad, the same situation will happen. This is a major issue, especially in some large tables I work with where we do things a client or account at a time. These and other spatial and temporal locality scenarios are realistic, because lots of real data is unevenly distributed. The checksum/modulo approach isn’t optimal for this.
Finally, the bottom-up approach doesn’t allow for early optimization or working in-memory. It builds the entire tree, then does the search. There’s no chance to “prune” the tree or try to keep a small working set. The flip side of this is actually a strength: assuming that the whole tree needs to be built, bottom-up is optimal. But most of my data isn’t like that. If much of the table is corrupt, I’m going to do a mysqldump instead, so I want to optimize for cases where I’ll be able to prune the tree.
One solution: a top-down approach
Given that I won’t even be looking at a table unless the global checksum has already found it differs, I am considering the following top-down approach, or some variation thereof:
Generate groupwise checksums for the whole table in a top-level grouping (more on that later).
If more than a certain fraction of the groups differ, quit. Too much of the table is different.
Otherwise descend depth-first into each group that has differences.
I think this algorithm, with some tuning, will address most of my concerns above. In particular, it will allow a smart DBA to specify how the grouping and recursion should happen. The choice of grouping is actually the most complicated part.
I’d do this client-side, not server-side. I’d generate the checksums server-side, but then fetch them back to the client code and keep them in memory. Given a good grouping, this shouldn’t require much network traffic or memory client-side, and will avoid locks, eliminate scratch tables, and keep the queries from replicating.
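A minimal client-side sketch of this comparison step (illustrative only, not the actual tool's code; run_query() is an assumed helper that runs a query against one server and returns rows keyed by group, and the checksum column list is made up):

<?php
// ### Illustrative sketch of the top-down comparison. A per-group checksum is
// ### computed on master and slave, and only groups whose checksum or row count
// ### differ are recursed into with a narrower grouping / WHERE clause.
function compare_groups($table, $groupExpr, $where = '1=1') {
$sql = "SELECT $groupExpr AS grp, COUNT(*) AS cnt,"
. " SUM(CRC32(CONCAT_WS('#', id, col1, col2))) AS chk"
. " FROM $table WHERE $where GROUP BY grp";

$master = run_query('master', $sql); // ### grp => array('cnt' => ..., 'chk' => ...)
$slave = run_query('slave', $sql);

$badGroups = array();
foreach ($master as $grp => $row) {
$slaveRow = isset($slave[$grp]) ? $slave[$grp] : NULL;
if (!$slaveRow || $slaveRow['chk'] != $row['chk'] || $slaveRow['cnt'] != $row['cnt']) {
$badGroups[] = $grp; // ### descend into this group only
}
}
// ### Groups that exist only on the slave would need a second pass, and
// ### SUM(CRC32(...)) is a deliberately weak checksum used just to keep the sketch short.
return $badGroups;
}
?>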
In the best case, all other things being equal, it will require the server to read about as many rows as the bottom-up approach, but it will exploit locality — a client at a time, a day at a time, and so on. This is a huge help, in my opinion; reducing random I/O is a high priority for me.
Given all this, I think top-down is better if there are not many changes to resolve, or if they’re grouped tightly together.
Some of the weaknesses I see are complexity, a proliferation of recursion and grouping strategies, perhaps more network traffic, and susceptibility to edge cases. Whereas the bottom-up approach has identical best and worst cases for different distributions of corrupt rows (assuming the number of corrupt rows is constant), the top-down approach suffers if there’s no locality to exploit. I’m a bit worried about edge cases causing this to happen more than I think it ought to.
Finally, and this could be either a strength or weakness, this approach lets every level of the recursion have a different branching factor, which might be appropriate or not — the DBA needs to decide.
Smart grouping and recursion
I think the hardest part is choosing appropriate ways to group and “drill down” into the table. Here are some possible strategies:
Date groupings. We have a lot of data in InnoDB tables with day-first or week-first primary keys, which as you know creates a day-first or week-first clustered index. The first checksum I’d run on these tables would be grouped by day.
Numeric groupings. Tables whose primary key is an auto-incremented number would probably be best grouped by division, for example, floor(id/5000) to group about 5000 neighboring rows together at a time.
Character groupings. If the primary key is a character string, I might group on the first few letters of the string.
Drill-down. Take for example one of our tables that is primary-keyed on IDs, which are auto-incremented numbers, and client account numbers. The best way to do the table I’m thinking of is by account number, then numerically within that on ID. For the day-first table, I’d group by day, then account number, and then by ID.
Exploit append-only tables. If a table is append-only, then corruption is likely in the most recent data, and I might try to examine only that part of the table. If there are updates and deletes to existing rows, this approach might not work.
Use defaults if the DBA doesn’t specify anything. If there’s a multi-column primary key, recurse one column at a time. If a single-column key, look for another key whose cardinality is less, and recurse from that to the primary key instead.
I think the DBA will have to choose the best strategy on a table-by-table basis, because I can’t think of a good automatic way to do it. Even analyzing the index structures on the table, and then trying to decide which are good choices, is too risky to do automatically. For example, SHOW INDEX will show estimated index cardinalities, but they’re based on random dives into the index tree and can be off by an order of magnitude or more.
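For a checksum query like the sketch above, these strategies mostly just swap in a different grouping expression, for example (illustrative only; the column names are made up):

<?php
// ### Illustrative grouping expressions for a per-group checksum query.
$byDay = 'DATE(created)'; // ### date grouping
$byBlock = 'FLOOR(id / 5000)'; // ### ~5000 neighbouring ids per group
$byPrefix = 'LEFT(code, 2)'; // ### first letters of a character key
$byAccount = 'account_id'; // ### drill down one client account at a time
?>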
How to resolve the differences
Again assuming that this reconciliation is taking place between a master and slave server, it’s important to fix the rows without causing more trouble while the fixing happens. For example, I don’t want to do something that’ll propagate to another slave that’s okay, and thereby mess it up, too.
Fixing the rows on the master, and letting the fixes propagate to the slave via the normal means, might actually be a good idea. If a row doesn’t exist or is different on the slave, REPLACE or INSERT .. ON DUPLICATE KEY UPDATE should fix the row on the slave without altering it on the master. If the row exists on the slave but not the master, DELETE on the master should delete it on the slave.
Peripheral benefits of this approach are that I don’t need to set up an account with write privileges on the slave. Also, if more than one slave has troubles with the same rows, this should fix them all at the same time.
Issues I need to research are whether the different number of rows affected on the slave will cause trouble, and if this can be solved with a temporary slave-skip-errors setting. The manual may document this, but I can’t find it.
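As a small illustration of the 'fix it on the master' idea (a sketch only, not the article's code; escaping and NULL handling are omitted and the table/column names are made up):

<?php
// ### Re-issue a row read from the master as REPLACE, so the statement
// ### replicates and overwrites the drifted copy on the slave.
function make_replace_sql($table, $row) {
$cols = implode(', ', array_keys($row));
$vals = "'" . implode("', '", array_values($row)) . "'";
return 'REPLACE INTO ' . $table . ' (' . $cols . ') VALUES (' . $vals . ');';
}

// ### e.g. make_replace_sql('accounts', array('id' => '42', 'balance' => '100.00'))
// ### gives: REPLACE INTO accounts (id, balance) VALUES ('42', '100.00');
?>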
Next steps
I’m looking forward to your feedback, and then I plan to build a tool that’ll implement whatever algorithm emerges from that discussion. At this point, assuming the above algorithm is as good as we can come up with together, I’m planning to actually implement both top-down and bottom-up approaches in the tool, so the DBA can decide what to use. The tool will, like the rest of the scripts in the MySQL Toolkit, be command-line friendly (there are lots of proprietary “visual tools” to compare and sync tables, but they don’t interest me — plus, why would I ever trust customer data to something I can’t see source code for?). I also understand that not everyone has the same narrowly-defined use case of re-syncing a slave, so of course I’ll make the tool more generic.
For my own use, ideally I’ll be making sure the tool is rock-solid, then defining rules for tables that frequently drift, and running a cron job to automatically find which tables are different and fix them. If the MySQL Table Checksum tool finds a table is out of sync and I don’t have a rule for it, it’ll just notify me and not try to fix it.
Summary
In this article I proposed some ideas for a top-down, in-client, replication-centric way to compare a table known to differ on a master and slave, find the rows that differ, and resolve them. I’m thinking about building a tool to implement this algorithm, and would like your feedback on efficient ways to do this.
Written by Xaprb
March 5th, 2007 at 11:46 pm
Posted in SQL
« MySQL Table Checksum bug fix
What to do when MySQL says skip-innodb is defined »
10 Responses to 'An algorithm to find and resolve data differences between MySQL tables'
Subscribe to comments with RSS or TrackBack to 'An algorithm to find and resolve data differences between MySQL tables'.
Nice article!
We have a similar tool called SQLyog Job Agent which incorporates most of what you have discussed in this article. Unfortunately, it is not open-source.
We are always trying to improve the algorithm and look forward to more articles on this topic!


Rohit
6 Mar 07 at 2:32 am


Very interesting. I’ve already developed a simple tool to do just that, albeit based on a simple row-by-row comparison, with the correcting actions being inserts or updates directly on the slave.
It can recurse all the databases and tables to repair an entire database, or just operate on a single table.
I’d not considered doing the corrective action on the master, but it’s an interesting idea.
You’re absolutely correct that tables that lack a primary key are barely worth attempting, and so far my script ignores them.
One thing that I’ve found in practice, is that you must perform the synchronisation while replication is actually running. If you don’t, you will inevitably end up replicating ahead of the normal replication process and breaking it.
I’ve found that this tool is useful for bringing up a new replica where it’s impossible for the master to be affected in any way, such as through table locking.
If you do develop a working implementation of your own, do let me know!
James


James Holden
6 Mar 07 at 7:48 am


Rohit, if I read your comment right, you’re subtly saying I’m going a good direction, which is encouraging :-)
James, you’ve reinforced my belief that lots of people need a tool that doesn’t disrupt replication.


Xaprb
6 Mar 07 at 8:57 am


I finally read the stuff (this article and the source code). The various discussions are very interesting.
I’m not sure that I understand fully the “index” issue with the bottom-up approach. Basically each summary table is built (once) and then it is scanned just once, so having an index built on some attribute would not be amortized. The only exception may be for bulk deletes or inserts, but that should not happen.
On the “exploit append-only tables” idea, the bottom-up approach can have a “where” clause on the initial table so that the comparison is only performed on part of the data. Moreover, if the candidate tuples are somehow an identifiable fraction of the table, it might be simpler to just
download them directly for comparison, that would be a third algorithm:-)
Do you have performance figures with your tool in different settings?


Fabien Coelho
15 May 07 at 5:06 am


Hello Fabien. The issue with the indexing is not scans, but lookups from a child table to its parent tables, including the group-by queries. These happen potentially many times. I could benchmark with and without indexes fairly easily and see for sure, but after writing all the queries I’m satisfied the index is important.
The WHERE clause has proven to be very important, as you guessed.
I did some testing with real data, and the results are here: Comparison of Table Sync Algorithms.


Xaprb
15 May 07 at 7:57 am


I have some problem with query sql.
I want you correct for me.
this is my query :
$queryselect=”select Distinct student.id_student,first_name,last_name,promotion,class
from student,applicationrecieve where student.id_student applicationrecieve.id_student”;
With the query that I wrote above, I want to select the data from the table student that does not exist in the table applicationrecieve.
thanks,
regards,
chhivhorng


chhivhorng
27 Jun 07 at 5:57 am


Nice article. I’ve studied the algorithm before, but it wasn’t so clear to me.
Actually I met the same need for keeping my databases synchronized, and for some days I’m trying to build a suitable algorithm for that. The Top-Down algorithm is good, but, as you mentioned, it is too hard to find a good grouping. And what is to be done if there is only one indexed column as the Primary Key? So I find the Bottom-Up more suitable for that purpose, but with some differences: I’m going to build a B*-Tree based on row checksums for each table, keep it saved locally (I think an XML structure is a good way) and, for saving time and traffic, do the comparison locally too. I’m not sure it’s the best way for that, but I want to try.
Best regards.


Negruzzi Cristian
9 Oct 07 at 8:38 am


Please take a look at MySQL Table Sync in the MySQL Toolkit (http:mysqltoolkit.sourceforge.net). It may save you a lot of work. I’ve implemented both algorithms there.


Xaprb
9 Oct 07 at 8:50 am


But by all means, explore your algorithm too! I don’t mean to say you shouldn’t. It may be a much better way.


Xaprb
9 Oct 07 at 8:52 am


i want the algorithm to find the table of any no.


paramjeet
14 Oct 08 at 8:01 am