Wednesday, October 13, 2010

Woes of ROW based logging implementation

I have been trying to find ways to implement ROW based logging at our company, as it provides better reliability and far less chances for a slave going "out-of-sync" with a master. One of the big issues that I faced was constant replication lag from one datacenter to another because of the massive amounts of data that can potentially be generated just from one single SQL statement.

With the traditional STATEMENT based replication, one SQL statement is written to the binary log - very little network overhead there transferring that across the wire to another datacenter. But if that single SQL statement changes 20,000 rows, well that's where agony begins, and business continuity takes a beating.

And to compound situations even further, more and more operations are suddenly becoming "unsafe for STATEMENT based logging", generating hundreds upon thousands of warning statements in error log files. With 5.1.50, LOAD DATA INFILE statements generate warnings now. This leads me to believe that at some point, these warnings will soon become unsupported operations. Blech! Does not give me the warm fuzzy feeling I used to have with MySQL. But I will still keep trying to find a real-world solution to this problem.