Wednesday, September 22, 2010

Extracting load files from the binary log

There are times when you may be rebuilding a DB server by replaying the binary logs using the mysqlbinlog utility. Extracting CRUD statements and DDL is relatively straightforward, but not for statements like LOAD DATA INFILE. The actual data file is embedded within the binary log, and not very visible to the naked eye. But there is an easy way to decipher the binary log and extract the file to load manually.

As an example, I have taken a simple text file of numbers and loaded it into a fictitious table abc using the LOAD DATA LOCAL INFILE command. To see where in the binary log that command would reside, the mysqlbinlog utility is used:


$ mysqlbinlog mysqld-bin.000003 | grep -i -B7 "load data"

# at 174
#100921 21:42:10 server id 1136902037 end_log_pos 218
#Begin_load_query: file_id: 1 block_len: 21
# at 218
#100921 21:42:10 server id 1136902037 end_log_pos 432 Execute_load_query thread_id=5 exec_time=0 error_code=0
use test/*!*/;
SET TIMESTAMP=1285130530/*!*/;
LOAD DATA LOCAL INFILE '/tmp/SQL_LOAD_MB-1-2' INTO TABLE `abc` FIELDS TERMINATED BY '\t' ENCLOSED BY '' ESCAPED BY '\\' LINES TERMINATED BY '\n' (`a`)

$

You can see that from the command above that the command, including the load file is contained between positions 174 and 432 of the binary log. Now that the start/stop positions are known, it is possible to extract the data file to load manually into your database (or into another database):

mysqlbinlog --start-position=174 --stop-position=432 --local-load=/var/tmp mysqld-bin.000003

The --local-load option specifies what directory to store the actual load file. And you can take the file and use the LOAD DATA command above (changing the directory name as needed) and load that data back into your database or use it to seed another database.