Partha's Blog - Tidbits & info for the Power MySQL Database User<br />
by Partha Dutta<br />
<br />
<b>Effects of ALL_O_DIRECT on a TPCC workload</b> (2011-06-28)<br />
Recently Mark Callaghan blogged about <a href="http://www.facebook.com/note.php?note_id=10150219759210933">using O_DIRECT for the InnoDB transaction log</a>. He noted that there was no significant performance gain as the number of concurrent connections increases. I've done my share of testing and retesting over the past months to determine how useful this is. Based on a small TPCC workload (100 warehouses, 64 connections, 1-hour test, 5-minute ramp-up), I've seen huge performance gains by setting the variable innodb_flush_method to ALL_O_DIRECT using Percona XtraDB.<br />
<br />
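For reference, the change under test is a single line in my.cnf. ALL_O_DIRECT is a Percona XtraDB extension (stock InnoDB of that era accepts only values like fdatasync, O_DSYNC and O_DIRECT), so this sketch assumes a Percona Server/XtraDB build:

```ini
[mysqld]
# XtraDB extension: open the transaction log as well as the data files
# with O_DIRECT, bypassing the OS page cache for both
innodb_flush_method = ALL_O_DIRECT
```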
Without using Direct I/O, the benchmark generated a TpmC score of approximately 24,500 (HP DL160 G6, 2 x Xeon E5620 2.40GHz, 16GB mem, 4x300GB SAS, RAID-10). After setting the variable to ALL_O_DIRECT, the TpmC score went up to 48,000. A huge increase. This deserves some more investigation and testing. I also want to try this out on some older hardware to see if similar performance gains can be achieved.<br />
<br />
<b>Effect of changing InnoDB Page Size</b> (2011-05-20)<br />
I was bored, so I started playing with the performance impact of changing the InnoDB page size. Interesting results from XtraDB (which makes it oh so easy to change the page size). I won't go into the gory details, but the testing was performed against a machine with 6x146GB SAS drives and a read-only sysbench test (50 million records, 10GB InnoDB buffer pool):<br />
<br />
<div class="separator" style="clear: both; text-align: center;"><a href="http://4.bp.blogspot.com/-tqhGGH-VMEw/Tdbo1k10DyI/AAAAAAAAABo/Jf_r-qX8jH0/s1600/PageSizebench01_result.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="192" src="http://4.bp.blogspot.com/-tqhGGH-VMEw/Tdbo1k10DyI/AAAAAAAAABo/Jf_r-qX8jH0/s320/PageSizebench01_result.png" width="320" /></a></div><br />
The graph shows that the good old defaults work quite nicely. I'll have to get my hands on some machines with SSD drives and try this again :)<br />
<br />
<b>Covering indexes in MySQL - revisited (with benchmark)</b> (2011-04-23)<br />
In the process of building a new benchmark tool for Yahoo, I needed a good "guinea pig." I think I found one by showing how much more powerful covering indexes can be with InnoDB. A covering index is one where the index itself contains all of the necessary data field(s). In other words, <b>no need to access the data page</b>!<br />
<br />
Here's a sample table:<br />
<pre>CREATE TABLE entity (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`entity_type` enum('Advertiser','Publisher','Network') NOT NULL DEFAULT 'Advertiser',
`managing_entity_id` int(10) unsigned DEFAULT NULL,
....
PRIMARY KEY (`id`),
KEY `ix_entity_managing_entity_id` (`managing_entity_id`),
....
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
</pre><br />
It's a wide table with 88-odd columns and an average row length of 240 bytes.<br />
<br />
Now to test the QPS difference between a query like this:<br />
<pre>SELECT * FROM entity WHERE entity.managing_entity_id = X;
</pre><br />
and a query that uses a covering index:<br />
<pre>SELECT entity.managing_entity_id FROM entity WHERE entity.managing_entity_id = X;
</pre><br />
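A quick way to confirm that the second query is actually covered is EXPLAIN: when InnoDB can satisfy the query from the index alone, the Extra column reports "Using index". A sketch (the literal id 1 is a placeholder, and the exact row estimates depend on your data):

```sql
EXPLAIN SELECT entity.managing_entity_id
FROM entity
WHERE entity.managing_entity_id = 1\G
-- key:   ix_entity_managing_entity_id
-- Extra: Using index      <-- no data-page access needed
```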
I have an SQL file that contains 2000 SQL statements with fairly random ids, and I'll run my benchmark program with 4 clients a few times to get a good QPS reading (throwing out the low & high numbers).<br />
<br />
Results with SELECT *:<br />
<pre>Tot Q=2000,Tot T=1.757571,QPS=1137.93411475269,Min=0.0006665,Max=0.027022,90th Pctl=0.000946
Tot Q=2000,Tot T=1.770522,QPS=1129.61036349732,Min=0.0006635,Max=0.026858,90th Pctl=0.00096675
Tot Q=2000,Tot T=1.753147,QPS=1140.80564835693,Min=0.00066325,Max=0.026966,90th Pctl=0.00095175
</pre><br />
Results with covering index:<br />
<pre>Tot Q=2000,Tot T=0.703581,QPS=2842.60092299252,Min=0.00026625,Max=0.014821,90th Pctl=0.00035325
Tot Q=2000,Tot T=0.897792,QPS=2227.6874821785,Min=0.00026625,Max=0.04546425,90th Pctl=0.0003595
Tot Q=2000,Tot T=0.720616,QPS=2775.403266094,Min=0.000257,Max=0.01566475,90th Pctl=0.0003452
</pre><br />
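The QPS column in these result lines is just total queries divided by total elapsed time, which is easy to sanity-check (a verification sketch, not part of the original benchmark tool):

```shell
# QPS = Tot Q / Tot T; reproduces the first "SELECT *" result line
awk 'BEGIN { q = 2000; t = 1.757571; printf "QPS=%.2f\n", q / t }'
```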
Fairly significant results. I've seen many cases where SQL has been hastily written just to perform a simple check of whether a record exists or not. I've always tried to argue this point with the developers I work with, but I never had the raw numbers to show the exact impact. If the query happens only a handful of times, it's not a huge deal. But if it happens hundreds of thousands or millions of times per day, then that's a different story: a very simple rewrite of the SQL helps immensely, and could even save on cost if you're thinking about adding another slave to spread out load.<br />
<br />
<b>Slamming the door shut on MySQL server access</b> (2011-04-07)<br />
Have you ever been in a situation where you needed to perform some serious maintenance work on your db server, and you just didn't want anyone at all to connect to it?<br />
<br />
Here's a one-liner: <pre>set global init_connect="kill connection_id();"</pre><br />
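Two caveats worth noting: init_connect is not executed for accounts with the SUPER privilege (which is why your own admin session can still get in and set the trap in the first place), and the door stays shut until you clear the variable. A sketch of the revert:

```sql
-- once maintenance is done, let everyone back in
SET GLOBAL init_connect = '';
```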
(I should have posted this 6 days ago)<br />
<br />
<b>Planet MySQL RSS feeds not working?</b> (2011-03-18)<br />
Not sure if it's just me, but has anyone else been having trouble with the RSS feeds? From Safari, it looks like there are no new posts for the last 3 days. I thought it was Safari (after upgrading to 5.0.4), but I see the same issue with Google Reader?? Or have I suddenly developed a case of stupid-itis?<br />
<br />
<b>Why the Facebook/Percona versions of MySQL are so much better</b> (2011-02-26)<br />
One of my roles at Yahoo is to provide a rock-solid infrastructure for MySQL-based projects. In order to provide that infrastructure, I have to satisfy many groups at Yahoo: developers need an easy way of deploying the db server in their environment, Software Engineering (Operations) is looking for features to help scale MySQL, and DBAs are looking for more knobs and gauges to help tune the server. After about one year of attempting to manage the infrastructure, and following the MySQL sagas, my job is getting easier thanks to companies like Facebook and Percona, as well as one of my predecessors (jcole), for releasing quality code that extends MySQL in so many more ways than just raw performance. I've been able to take the best code from each company and mold together an internal version that satisfies all crowds. MySQL/Oracle, you should do the same.<br />
<br />
<b>Are you using the deadline scheduler? (Part 2)</b> (2011-02-15)<br />
The deadline scheduler has many advantages over the cfq scheduler, which is the default in operating systems like Red Hat Enterprise Linux. In my <a href="http://ralpartha.blogspot.com/2010/12/are-you-using-deadline-scheduler-part-1.html">previous post</a>, I quickly showed how much of a performance gain can be had by switching to the deadline scheduler. Now I will show some real performance numbers (TpmC) for different RAID configurations.<br />
<br />
<table><tr> <td>TpmC</td> <td>2 disk RAID-0</td> <td>4 disk RAID-0</td> <td>6 disk RAID-5</td> <td>6 disk RAID-10</td> </tr>
<tr> <td>CFQ</td> <td>461.467</td> <td>947.067</td> <td>845.943</td> <td>862.763</td> </tr>
<tr> <td>Deadline</td> <td>851.067</td> <td>2876.933</td> <td>2145.866</td> <td>2580.430</td> </tr>
</table><br />
All of the tests were performed on a Dell PowerEdge 2950 with 2x Quad-Core Xeon, 16GB of memory and 6x146GB SAS drives on a Perc/5 RAID controller, and all filesystems were standard ext3. The TPCC benchmarks were conducted with a smallish buffer pool (2GB) and a 1GB log file size. The database is approximately 7GB in size (100 warehouses); I wanted to show what performance an I/O-bound test would yield. The numbers here show that it's possible to get that added boost <b>without</b> resorting to re-creating the entire database with a different filesystem (like XFS). I will come back to XFS later, as it provides the best performance.<br />
<br />
<b>Are you using the deadline scheduler? (Part 1)</b> (2010-12-19)<br />
There have been many posts about performance, benchmarking and the results. Many DBAs have talked in the past about the <b>deadline</b> scheduler, available in all modern Linux distributions. The deadline scheduler is very effective for database systems, as it tries to prevent starvation of I/O requests. To see the net effect of using the deadline scheduler, I am using a simple TPCC benchmark program created by the Percona team, <a href="https://code.launchpad.net/%7Epercona-dev/perconatools/tpcc-mysql">tpcc-mysql</a>, to measure the number of New-Order transactions per minute (TpmC). All of the testing was performed using Red Hat Enterprise Linux 5, using the standard ext3 file system, with both the default scheduler (cfq) and the deadline scheduler (dl):<br />
<br />
<div style="font-family: "Courier New",Courier,monospace;">CFQ: 845 TpmC</div><span style="font-family: "Courier New",Courier,monospace;">DL: 2145 TpmC</span><br />
<br />
This is a huge difference in performance between the two schedulers. How do you know if you are using the deadline scheduler? There are a few ways. The deadline scheduler can be enabled for all of your disks at boot time via the LILO or GRUB boot loaders. The scheduler can be changed by appending the string <span style="font-family: "Courier New",Courier,monospace;">elevator=deadline</span> to the boot line (file <span style="font-family: "Courier New",Courier,monospace;">/boot/grub/grub.conf</span>). If you issue the command <span style="font-family: "Courier New",Courier,monospace;">cat /proc/cmdline</span> this will display the kernel parameters that the system was booted with.<br />
<br />
The second way of switching schedulers is to change it "on the fly." You can list all of the available schedulers, and see which one is current, with this command: <span style="font-family: "Courier New",Courier,monospace;">cat /sys/block/drive/queue/scheduler</span>, where drive is the actual drive presented to your system (i.e. sda, hda). The list of schedulers is displayed, and the current scheduler is noted within square brackets. To change the scheduler, echo the name of the new scheduler to the meta-file from the previous command (as user root). Be careful when switching schedulers on the fly, as there is a possibility of hanging the system.<br />
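The check-and-switch steps above can be sketched as follows; the active scheduler is the bracketed entry, and the small helper extracts it (the device name sda and the /sys path are assumptions about your system):

```shell
# Extract the bracketed (active) scheduler from a scheduler line
parse_scheduler() {
  printf '%s\n' "$1" | grep -o '\[[a-z]*\]' | tr -d '[]'
}

# Typical use on a live system (second command must run as root):
#   parse_scheduler "$(cat /sys/block/sda/queue/scheduler)"
#   echo deadline > /sys/block/sda/queue/scheduler

# Demonstration on a sample scheduler line:
parse_scheduler 'noop anticipatory deadline [cfq]'   # -> cfq
```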
<br />
In part 2, I'll go over my testing methodology and show how using the deadline scheduler with different types of RAID drive configurations and filesystem types affects the overall throughput of the system.<br />
<br />
<b>MySQL 5.5 may be GA, but not the docs.....</b> (2010-12-15)<br />
<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_9RW-lu--p8I/TQlO3_wN3JI/AAAAAAAAABY/c3UB5RRqQ50/s1600/MySQL-GA-Docs.png"><img style="cursor:pointer; cursor:hand;width: 400px; height: 197px;" src="http://4.bp.blogspot.com/_9RW-lu--p8I/TQlO3_wN3JI/AAAAAAAAABY/c3UB5RRqQ50/s400/MySQL-GA-Docs.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5551054739624287378" /></a><br />
<br />
<b>Woes of ROW based logging implementation</b> (2010-10-13)<br />
I have been trying to find ways to implement ROW-based logging at our company, as it provides better reliability and far fewer chances of a slave going "out of sync" with its master. One of the big issues that I faced was constant replication lag from one datacenter to another, because of the massive amount of data that can potentially be generated from one single SQL statement.<br /><br />With traditional STATEMENT-based replication, one SQL statement is written to the binary log; there is very little network overhead in transferring that across the wire to another datacenter.
But if that single SQL statement changes 20,000 rows, well, that's where the agony begins, and business continuity takes a beating.<br /><br />To compound the situation even further, more and more operations are suddenly becoming "unsafe for STATEMENT based logging," generating hundreds upon thousands of warning statements in error log files. As of 5.1.50, LOAD DATA INFILE statements generate warnings too. This leads me to believe that at some point these operations will become unsupported. Blech! It does not give me the warm fuzzy feeling I used to have with MySQL. But I will still keep trying to find a real-world solution to this problem.<br />
<br />
<b>Extracting load files from the binary log</b> (2010-09-22)<br />
There are times when you may be rebuilding a DB server by replaying the binary logs using the <code>mysqlbinlog</code> utility. Extracting CRUD statements and DDL is relatively straightforward, but not for statements like <code>LOAD DATA INFILE</code>. The actual data file is embedded within the binary log, and not very visible to the naked eye. But there is an easy way to decipher the binary log and extract the file to load manually.<br /><br />As an example, I have taken a simple text file of numbers and loaded it into a fictitious table abc using the <code>LOAD DATA LOCAL INFILE</code> command.
To see where in the binary log that command resides, the <code>mysqlbinlog</code> utility is used:<br /><br /><pre><br />$ mysqlbinlog mysqld-bin.000003 | grep -i -B7 "load data"<br /><br /># at 174<br />#100921 21:42:10 server id 1136902037 end_log_pos 218<br />#Begin_load_query: file_id: 1 block_len: 21<br /># at 218<br />#100921 21:42:10 server id 1136902037 end_log_pos 432 Execute_load_query thread_id=5 exec_time=0 error_code=0<br />use test/*!*/;<br />SET TIMESTAMP=1285130530/*!*/;<br />LOAD DATA LOCAL INFILE '/tmp/SQL_LOAD_MB-1-2' INTO TABLE `abc` FIELDS TERMINATED BY '\t' ENCLOSED BY '' ESCAPED BY '\\' LINES TERMINATED BY '\n' (`a`)<br /><br />$<br /></pre><br />You can see from the output above that the command, including the load file, is contained between positions 174 and 432 of the binary log. Now that the start/stop positions are known, it is possible to extract the data file to load manually into your database (or into another database):<br /><pre><br />mysqlbinlog --start-position=174 --stop-position=432 --local-load=/var/tmp mysqld-bin.000003<br /></pre><br />The <code>--local-load</code> option specifies in which directory to store the actual load file. You can then take the file and use the <code>LOAD DATA</code> command above (changing the directory name as needed) to load that data back into your database or to seed another database.<br />
<br />
<b>I did not fall off the face of the earth....</b> (2009-11-29)<br />
It's been a looong time since I've posted anything. For the past year, I've been focused on operations and streamlining DBA tasks, as the group's responsibilities continue to grow. It's one thing to manage 10-20 production MySQL database servers, but when the number starts climbing to 160-200, things start getting interesting.
For 2010, I expect that number to double. Performance is key, but more important are reliability, uptime, monitoring and notification. Dashboards are a good start, but the most important subsystem will be monitoring. How scalable does the system need to be? For 10-20 servers, off-the-shelf products work fine. But when thousands of systems need to be monitored, then it starts getting interesting. I'll share my thoughts along the way on how we are handling this type of growth.<br />
<br />
<b>Should you be worried about STATEMENT based replication?</b> (2008-10-25)<br />
Earlier this month, it was <a href="http://www.planetmysql.org/entry.php?id=15435">announced that STATEMENT based binary logging</a> would be the default starting with MySQL version 5.1.29. I've always preached that backwards compatibility is key to new releases. In this case, lessons were not learned until close to the final GA date.<br /><br />I would like to point out that for 90% of customer cases, STATEMENT based replication will work fine as advertised. But I'd like to point out some use cases where STATEMENT based replication will be at best spotty (at least it is in 5.1.28).<br /><br />If you primarily use InnoDB as your storage engine, you will want to pay close attention to your transaction isolation level. Statement-based logging requires at least the REPEATABLE READ level; with READ COMMITTED or READ UNCOMMITTED, InnoDB is limited to row-based logging.<br /><br />Partitioning + InnoDB + STATEMENT-based binlog also has its problems. We faced constant issues, getting the error 'Binary logging not possible. Message: Statement-based format required for this statement, but not allowed by this combination of engines'. What the heck does this mean?<br /><br />It's a misleading error message. Partitioning is not a true storage engine, but a virtual one.
The first time an underlying table is opened, the partition engine caches the table flags of the real storage engine, in this case InnoDB. Let's say, for example, an app performs a SELECT on a partitioned InnoDB table. Let's also assume that the transaction isolation level is READ UNCOMMITTED. The SELECT will execute without any issues. But try to insert a record from a different session, and it will fail every time. I filed this as <a href="http://bugs.mysql.com/bug.php?id=39084">BUG#39084</a> over a month and a half ago. Repeatable test cases were also given.<br /><br />Since we are committed to releasing our reporting database and can not wait for MySQL to come around and realize the mistakes they made, we came up with our own patch that addresses this immediate concern, for this reporting db.<br /><br />So, should you be worried? I think I would be.<br />
<br />
<b>Wishlist for partitioning</b> (2007-12-02)<br />
I love the way partitioning works in MySQL. I remember how many projects I implemented in the past using application logic to parallelize I/O; partitioning makes this seamless now. But it's not without its share of problems and workarounds. So I compiled my own wishlist that hopefully might make it into a future version of MySQL.<br /><br />1. Partition-level table locking. Partitions should be treated like tables and locked individually, rather than locking the whole table and all of its partitions.<br />2. Ability to add partitions from existing tables. This is very useful, especially when trying to perform bulk maintenance operations.<br />3. Ability to convert a partition to a table.<br />4. Ability to mix and match storage engines for partitions and subpartitions.
How cool would it be to have an archive partition for older data using the ARCHIVE engine, while the remaining partitions are InnoDB or MyISAM?<br />5. More usable datatypes for partition pruning. How many times can you use datetime when timestamp is available? And when will functions other than TO_DAYS and YEAR be supported?<br />
<br />
<b>Version 3 of mysqlbackup - small bug fix</b> (2007-06-26)<br />
I just posted version 3 of <a href="http://forge.mysql.com/snippets/download.php?id=14">mysqlbackup</a> to <a href="http://forge.mysql.com">MySQL Forge</a>.<br /><br />Small bugfix: added the option --add-drop-table to the default options for mysqldump. Its absence was causing a failure in restoring views.<br />
<br />
<b>Not all MySQL errors are visible to replication</b> (2007-06-17)<br />
This probably warrants a bug report to MySQL, but I want to let other people know about this first. There are situations where MySQL receives incomplete statements from replication relay logs, but does not trigger a replication error. A case in point is exceeding <span style="font-weight: bold;font-family: courier new">max_allowed_packet</span>.<br /><br />I recently had a situation where one of my machines was incorrectly configured with a different value for <span style="font-weight: bold;font-family: courier new">max_allowed_packet</span>. What happened is not what I had expected. Instead of raising a replication error (which we monitor for using Nagios), the MySQL error log was spewing messages about exceeding max_allowed_packet.
Meanwhile, the only visible problem through our monitoring framework was that replication had fallen behind, and was continuing to fall behind.<br /><br />Fixing the problem was rather easy: stop the slave, change the max_allowed_packet variable globally in the db server and in the configuration file, and then start the slave.<br /><br />This is one of those things that falls under the category of "MySQL annoyances and one-offs." Shouldn't this really trigger a true replication error, rather than spewage in log files? I will have to reproduce this and then file a bug report to MySQL, but I really shouldn't have to if there were some consistency in error reporting.<br />
<br />
<b>Script to backup binary logs on a master</b> (2007-06-17)<br />
I have recently posted a script on <a href="http://forge.mysql.com">MySQL Forge</a> to back up MySQL binary logs. One of the ideas that I had when I originally wrote the script was to take into account all of the slaves and the master log file & position that each one has executed. This way, only the relevant binary logs get archived and then subsequently purged. You can find the script <a href="http://forge.mysql.com/snippets/view.php?id=73">here</a>.<br />
<br />
<b>Reload data quickly into MySQL InnoDB tables</b> (2007-06-17)<br />
As DBAs that manage large quantities of database servers, we are always looking for the fastest or most efficient way to load data into the database.
Some DBAs have quarterly maintenance periods where they reload data into a database to refresh the indexes.<br /><br />If you primarily use InnoDB tables in your MySQL database server, then this set of tricks will help make the reload process a bit faster than just a straight dump & reload.<br /><br /><span style="font-weight: bold;">my.cnf configuration</span><br /><span style="font-family:courier new;">innodb_flush_log_at_trx_commit = 0<br />innodb_support_xa = 0<br />skip-innodb-doublewrite<br />disable log-bin & log_slow_queries</span><br /><br />Since the goal is to reload data quickly, we need to eliminate any potential bottlenecks. Setting <span style="font-family:courier new;">innodb_flush_log_at_trx_commit = 0</span> will reduce the amount of disk I/O by avoiding a flush to disk on each commit. If you are not using XA-compliant transactions (multi-system two-phase commits), then setting <span style="font-family:courier new;">innodb_support_xa = 0</span> will avoid an extra disk flush before the transaction starts. The <span style="font-family:courier new;">skip-innodb-doublewrite</span> option will turn off the use of the doublewrite buffer for inserts, which will eke out a little bit more performance. Also, if you don't need the binary log, turn it off during your reload period. Remember, any excess disk I/O that is not needed will hurt the performance of reloading the database.<br /><br /><span style="font-weight: bold;">Unloading the data</span><br />There are many ways to unload & reload the data using the standard MySQL tools or your own crafted toolset. Again, the main idea is efficiency.
The best advice here is: while selecting the data to be unloaded, make sure that the select is in <span style="font-weight: bold;">primary key order</span>. If the data is sorted ahead of time, it loads back into InnoDB pretty fast, as the primary key is a clustered index, meaning that the data is sorted based on the primary key as it is inserted into the database. Use the <span style="font-family:courier new;">--order-by-primary</span> option of the mysqldump utility while selecting the data.<br /><br />I hope these small tips help make the process a bit less painful.<br />
<br />
<b>Why is QA so important?</b> (2007-05-10)<br />
<a href="http://bp1.blogger.com/_9RW-lu--p8I/RkONh37BqGI/AAAAAAAAAAo/Z1yc-p8QwZk/s1600-h/addedsql.jpg"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://bp1.blogger.com/_9RW-lu--p8I/RkONh37BqGI/AAAAAAAAAAo/Z1yc-p8QwZk/s400/addedsql.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5063046019181291618" /></a><br /><br />Someone should really check the label before mass producing. Doesn't this look like SQL from SQL Server????<br />
<br />
<b>A case of no research before product rollout</b> (2007-02-28)<br />
I am digressing a bit from my usual topics, but this is pretty hilarious....<br /><br />Wanted to get a bit of an after-dinner snack, and I found some corn dogs in the freezer. Yum!<br /><br />Pop one out, read the instructions on the box (yes, RTFM!!!) and what do I see?
Instructions that read <b>Heat for 70 seconds</b>.<br /><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://bp3.blogger.com/_9RW-lu--p8I/ReYziymcpqI/AAAAAAAAAAU/fFZx8QbzDxs/s1600-h/cord-dog-stupidity.JPG"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;" src="http://bp3.blogger.com/_9RW-lu--p8I/ReYziymcpqI/AAAAAAAAAAU/fFZx8QbzDxs/s400/cord-dog-stupidity.JPG" alt="" id="BLOGGER_PHOTO_ID_5036769906052081314" border="0" /></a><br /><br />When was the last time you ever set the timer on a microwave oven for <b>70 seconds</b>?????!!!!! Would it have taken too much time to do some product research and realize that the label should say to set the microwave for <b>1 minute 10 seconds</b>?<br /><br />So is there a moral to this story? Kind of. Documentation is not to be taken lightly, no matter how trivial it may seem.<br />
<br />
<b>MySQL and iSCSI - a winning combo!</b> (2007-02-04)<br />
So it's been a very long time again between posts. So much has happened. Let me first begin by saying that I am very impressed with iSCSI performance, and I believe that it is mature enough to run production workloads (though it really depends on the type of workload).<br /><br />After all of the benchmarking and analysis, we finally decided to move forward with a purchase of an iSCSI storage solution. For the types of queries we run (large numbers of records to scan, small resultsets returned), we had to tweak the schema just a bit in order to realize the performance that we desired (that, plus good-quality fibre-channel drives to get that extra oomph).<br /><br />Bottom line: we had to make a significant investment in hardware in order to realize the benefits of having a proper storage solution in place.
The benefits, though, outweigh the overwhelming maintenance required to keep all of the machines running. Backups using the storage provider's snapshot mechanism will be extremely beneficial as well. All in all, a good decision to ease our minds.<br />
<br />
<b>Primary Key Order Does Matter!</b> (2006-12-17)<br />
There have been a few posts on <a href="http://www.planetmysql.org/">PlanetMySQL</a> regarding primary keys and the importance of choosing the right one. This is even more important when the table uses InnoDB. You've read different posts about why it is so important. Now, I'm all about benchmarks and showing the details, so I'll take a table from my previous posts about MySQL 5.1 partitioning and show what I found.<br /><br />This table was created under MySQL 5.1.12-beta:<br /><pre><br />CREATE TABLE `big_table_test1` (<br />`entity_id` int(11) NOT NULL DEFAULT '0',<br />`col1` int(11) NOT NULL DEFAULT '0',<br />`col2` int(11) NOT NULL DEFAULT '0',<br />`col3` int(11) NOT NULL DEFAULT '0',<br />`col4` int(11) NOT NULL DEFAULT '0',<br />`col5` int(11) NOT NULL DEFAULT '0',<br />`col6` int(11) NOT NULL DEFAULT '0',<br />`ymdh` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',<br />`imps` bigint(20) NOT NULL DEFAULT '0',<br />`clicks` int(11) NOT NULL DEFAULT '0',<br />`convs` int(11) NOT NULL DEFAULT '0',<br />`id` int(10) unsigned NOT NULL DEFAULT '0',<br />PRIMARY KEY (`id`,`ymdh`),<br />KEY `ix_big1` (`ymdh`,`entity_id`,`col3`) USING BTREE,<br />KEY `ix_big2` (`ymdh`,`entity_id`,`col4`) USING BTREE,<br />KEY `ix_big3` (`ymdh`,`entity_id`,`col2`) USING BTREE<br />) ENGINE=InnoDB DEFAULT CHARSET=latin1 ROW_FORMAT=COMPACT<br /></pre><br /><br />I loaded about 180 million records into this table (a small set of data for us!)
and ran one of our really popular types of queries:<br /><br /><pre><br />SELECT col1,col2,col3,SUM(imps),SUM(clicks),SUM(convs)<br />FROM big_table_test1<br />WHERE ymdh IN ('2006-10-01 00:00:00','2006-10-02 00:00:00','2006-10-03 00:00:00',<br /> '2006-10-04 00:00:00','2006-10-05 00:00:00','2006-10-06 00:00:00',<br /> '2006-10-07 00:00:00')<br />AND entity_id = 2<br />GROUP BY col1, col2, col3<br />ORDER BY col1, col2, col3;<br /></pre>Doesn't look terribly nasty, does it? This query takes about <span style="font-weight: bold;">7 MINUTES</span> to run!!! <span style="font-weight: bold;">EXPLAIN</span> on the query shows nothing out of the ordinary, as it uses one of the secondary indexes on the table. The cardinality of <code>entity_id</code> is not really high, so forcing one secondary index over another wouldn't yield any performance benefits. The <code>id</code> column is basically a numerical hash of the table's "real" primary key, which is entity_id plus col1 through col6, and is used for uniqueness. What's interesting is that throughout our application, there are no direct queries against this id column. It just exists. But it can't be removed.<br /><br />If this column serves no really significant purpose, what if we swapped the order of the columns in the primary key definition? The primary key would then look like:<br /><pre><br />PRIMARY KEY (`ymdh`,`id`)<br /></pre><br />Logically, there is no difference, so we do not break any uniqueness constraints in the application. If we run the query again: <span style="font-weight: bold;">4 SECONDS!!!!</span> Wow! How do we explain this massive performance increase?<br /><br />Remember that InnoDB uses a clustered index for the primary key. Clustered indexes are indexes built on the same key by which the data is ordered on disk. They are very efficient during scanning, but have performance implications when inserting new data, as some re-ordering may need to be done.
All of our data is inserted in <code>ymdh</code> column order, so it makes sense for the primary key to lead with this column. A lot of efficiencies can be gained, such as sequential disk read-ahead. With the previous primary key ordering, reading the data portion of the table required lots of random disk I/O.Partha Duttahttp://www.blogger.com/profile/00972126296305304386noreply@blogger.com5tag:blogger.com,1999:blog-27216922.post-58488772961685925422006-12-04T09:54:00.001-05:002006-12-04T09:54:21.227-05:00MySQL Partitioning 5.1 - Part 5 Slowdown Problem Solved!Finally figured out what was causing the lack of performance of the query on the partitions. The use of FORCE INDEX was causing the slowdown with the partitioned table. Once I removed the clause, the query ran in 1 minute 19 seconds, which is more in line with expectations. Sorry for any inconvenience!Partha Duttahttp://www.blogger.com/profile/00972126296305304386noreply@blogger.com1tag:blogger.com,1999:blog-27216922.post-65912498957689812242006-12-04T02:39:00.000-05:002006-12-04T03:55:37.564-05:00MySQL 5.1 Partitioning - Part 4 (Results)In my previous post I started out by setting up MySQL 5.1.12 on a box in order to test the performance of the new partitioning option. During testing, I noted that I did not see any noticeable performance improvements from using partitioning. So I spent some time Saturday and Sunday (I guess I don't have anything better to do!) building the testing environment and performing the tests. So I was wrong, but only slightly. Partitioning does show better performance than standard tables, but not by as much as you would think. But wait, there is light at the end of the tunnel (as well as a WTF).
The numbers...<br /><br /><table border="4" frame="border" width="33%"><br /> <tr><br /> <th align="left">Table Type</th><br /> <th align="left">Elapsed Time</th><br /> </tr><br /> <tr><br /> <td>Normal</td><br /> <td>7 minutes, 41 seconds</td><br /> </tr><br /> <tr><br /> <td>Partitioned</td><br /> <td>5 minutes, 51 seconds</td><br /> </tr><br /></table><br /><br />By partitioning, we gain about a 25% improvement in performance. Not bad, but not great. I would have expected more than 25%, given that I loaded approximately 110 million records for the test and there are only approximately 650,000 records per day in the tested date range.<br /><br />Here are the details of the testing:<br /><ul><br /> <li>Machine is an HP DL385 with 16GB of memory, one 73GB 15k RPM SCSI drive for the boot, root, and /var partitions, and three 146GB 15k RPM SCSI drives in a RAID-0 stripe (hardware RAID).<br /> <li>CentOS 4.3 as the operating system<br /> <li>MySQL version 5.1.12-beta (compiled from SRPM)<br /> <li>InnoDB parameters:<br /> <ul><br /> <li><code>innodb_buffer_pool_size = 11G</code><br /> <li><code>innodb_file_per_table</code><br /> <li><code>innodb_open_files = 1000</code><br /> <li><code>innodb_log_file_size = 2000M</code><br /> <li><code>innodb_flush_method = O_DIRECT</code><br /> <li><code>innodb_flush_log_at_trx_commit = 1</code><br /> <li><code>innodb_support_xa = 0</code><br /> </ul><br /> <li> Prior to each test run, the database server was restarted and mysql was run with the database name (it takes approximately 40 seconds for mysql to run the first time as it caches the table information)<br /> <li>The test query was run 3 times (each with a db server restart) and the average of the 3 runs taken.<br /> <li> SQL (Non-partitioned): <code>select entity_id,buyer_entity_id,buyer_line_item_id,sum(roi_cost) from network_daily_local_nonpart force index (ix_nsdl_ymdh_entity_seller) where ymdh in ('2006-11-01 00:00:00', '2006-11-02 00:00:00', '2006-11-03 00:00:00', '2006-11-04 00:00:00',
'2006-11-05 00:00:00', '2006-11-06 00:00:00', '2006-11-07 00:00:00', '2006-11-08 00:00:00', '2006-11-09 00:00:00', '2006-11-10 00:00:00', '2006-11-11 00:00:00', '2006-11-12 00:00:00') and entity_id = 2 group by entity_id,buyer_entity_id,buyer_line_item_id order by entity_id,buyer_entity_id,buyer_line_item_id</code><br /> <li> SQL (Partitioned): <code>select entity_id,buyer_entity_id,buyer_line_item_id,sum(roi_cost) from network_daily_local_part force index (ix_nsdl_ymdh_entity_seller) where ymdh in ('2006-11-01 00:00:00', '2006-11-02 00:00:00', '2006-11-03 00:00:00', '2006-11-04 00:00:00', '2006-11-05 00:00:00', '2006-11-06 00:00:00', '2006-11-07 00:00:00', '2006-11-08 00:00:00', '2006-11-09 00:00:00', '2006-11-10 00:00:00', '2006-11-11 00:00:00', '2006-11-12 00:00:00') and entity_id = 2 group by entity_id,buyer_entity_id,buyer_line_item_id order by entity_id,buyer_entity_id,buyer_line_item_id</code><br /></ul><br /><br />Now, one thing about <a href="http://www.rightmedia.com">RightMedia</a> is that we have LOTS of data. In fact, there is some skew on the <code>entity_id</code> column, and the majority of the records have the value listed in the SQL above. That still should not explain huge differences in performance, should it?<br /><br />Okay, one last experiment (it's 3:46 AM EST!!): I am going to subpartition the partitioned table into 7 subpartitions, hashed by the <code>entity_id</code> column.<br /><br />Result: <strong>14.32 seconds!</strong> Holy friggin' turbo mode Batman! It's way too late to be trying to quantify performance gain, so I'm quantifying it by <strong>WTF</strong> units.
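For reference, the subpartitioning experiment can be expressed as a full repartition (a sketch; the partition names and date ranges are illustrative, and note that MySQL requires every column in the partitioning expression to appear in every unique key, so a table with <code>PRIMARY KEY (`id`,`ymdh`)</code> would need its key extended to include <code>entity_id</code> first):

```sql
-- Repartition the table: RANGE by day, then HASH each day's data
-- into 7 subpartitions by entity_id.
ALTER TABLE network_daily_local_part
PARTITION BY RANGE (TO_DAYS(ymdh))
SUBPARTITION BY HASH (entity_id)
SUBPARTITIONS 7
(
  PARTITION p20061101 VALUES LESS THAN (TO_DAYS('2006-11-02')),
  PARTITION p20061102 VALUES LESS THAN (TO_DAYS('2006-11-03')),
  -- ... one partition per day, as in the original scheme ...
  PARTITION pmax VALUES LESS THAN MAXVALUE
);
```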
I'd really like an explanation of what is going on behind the scenes that yields this kind of performance gain, and why I don't see at least 1 or 2 WTF units of gain on the original partitioned table, given that the amount of data in the date range is much smaller than the whole table.Partha Duttahttp://www.blogger.com/profile/00972126296305304386noreply@blogger.com3tag:blogger.com,1999:blog-27216922.post-61957591320951965762006-12-03T13:53:00.000-05:002006-12-03T13:54:02.531-05:00MySQL 5.1 Partitioning - Part 3Finally, it's time to start putting MySQL 5.1.12-beta through the wringer. First order of business: convert the existing table schema to one that supports partitioning...<br /><br />I made some minor changes to the configuration for partitioning, namely <code>innodb_file_per_table</code> and <code>innodb_open_files</code>. I set <code>innodb_open_files</code> to 1000 based on the tables and partitions I plan on supporting.<br /><br />This is what the new table schema looks like with partitioning:<br /><pre><br />CREATE TABLE `network_daily` (<br /> `entity_id` int(11) NOT NULL default '0',<br /> `buyer_entity_id` int(11) NOT NULL default '0',<br /> `buyer_line_item_id` int(11) NOT NULL default '0',<br /> `seller_entity_id` int(11) NOT NULL default '0',<br /> `seller_line_item_id` int(11) NOT NULL default '0',<br /> `size_id` int(11) NOT NULL default '0',<br /> `pop_type_id` int(11) NOT NULL default '0',<br /> `country_group_id` int(11) NOT NULL default '0',<br /> `is_adjustment` tinyint(4) NOT NULL default '0',<br /> `adv_learn_type` char(1) NOT NULL default '',<br /> `pub_learn_type` char(1) NOT NULL default '',<br /> `frequency` smallint(6) NOT NULL default '0',<br /> `ymdh` datetime NOT NULL default '0000-00-00 00:00:00',<br /> `imps` bigint(20) NOT NULL default '0',<br /> `clicks` int(11) NOT NULL default '0',<br /> `convs` int(11) NOT NULL default '0',<br /> `id` int(10) unsigned NOT NULL default '0',<br /> `checkpoint`
int(11) default NULL,<br /> PRIMARY KEY (`id`,`ymdh`),<br /> KEY `ix_nsl_ymdh_buyerli` (`ymdh`,`buyer_line_item_id`),<br /> KEY `ix_nsdl_ymdh_entity_buyer` (`ymdh`,`entity_id`,`buyer_entity_id`),<br /> KEY `ix_nsdl_ymdh_entity_seller` (`ymdh`,`entity_id`,`seller_entity_id`)<br />) ENGINE=InnoDB<br />PARTITION BY RANGE (TO_DAYS(`ymdh`))<br />(<br /> PARTITION p2005 VALUES LESS THAN (TO_DAYS('2006-01-01')),<br /> PARTITION p200601 VALUES LESS THAN (TO_DAYS('2006-02-01')),<br /> PARTITION P200602 VALUES LESS THAN (TO_DAYS('2006-03-01')),<br /><br /> ...<br /><br /> PARTITION P200609 VALUES LESS THAN (TO_DAYS('2006-10-01')),<br /> PARTITION P20061001 VALUES LESS THAN (TO_DAYS('2006-10-02')),<br /> PARTITION P20061002 VALUES LESS THAN (TO_DAYS('2006-10-03')),<br /> PARTITION P20061003 VALUES LESS THAN (TO_DAYS('2006-10-04')),<br /><br /> ...<br /><br /> PARTITION P20061130 VALUES LESS THAN (TO_DAYS('2006-12-01'))<br />);<br /></pre><br />I have defined a lot of partitions: I need to keep a rolling 60 days of daily partitions active. The plan is to use the <code>ALTER TABLE ... REORGANIZE PARTITION</code> statement to merge the older partitions together once per day, and to add a new partition once per day.<br /><br />After the data was reloaded, it was time to test the performance, and the ability to add partitions and reorganize them (this was broken in version 5.1.11-beta).<br /><br />Performance, surprisingly, wasn't what I expected. Queries that ran against the partitioned table performed about the same as those against the unpartitioned table. Hmm, I'll have to double-check my results. I'll post all of the performance data.<br /><br />The really good news is that the partition maintenance commands all worked with InnoDB! Dropping and reorganizing partitions worked perfectly. I'll have to redo my testing and see what happened to the performance.Partha Duttahttp://www.blogger.com/profile/00972126296305304386noreply@blogger.com7
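The daily maintenance plan described above can be sketched with two statements (a sketch; the partition names follow the scheme in the CREATE TABLE, but the exact names and dates are illustrative):

```sql
-- Add a new daily partition at the top of the range.
-- ADD PARTITION only works above the current highest range boundary.
ALTER TABLE network_daily
  ADD PARTITION (PARTITION p20061201 VALUES LESS THAN (TO_DAYS('2006-12-02')));

-- Merge the oldest daily partitions into a single coarser partition.
-- The partitions being reorganized must be adjacent in the range order.
ALTER TABLE network_daily
  REORGANIZE PARTITION P20061001, P20061002, P20061003 INTO
  (PARTITION p20061001_03 VALUES LESS THAN (TO_DAYS('2006-10-04')));
```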