Wednesday, October 13, 2010

Woes of ROW based logging implementation

I have been trying to find ways to implement ROW based logging at our company, as it provides better reliability and far less chances for a slave going "out-of-sync" with a master. One of the big issues that I faced was constant replication lag from one datacenter to another because of the massive amounts of data that can potentially be generated just from one single SQL statement.

With the traditional STATEMENT based replication, one SQL statement is written to the binary log - very little network overhead there transferring that across the wire to another datacenter. But if that single SQL statement changes 20,000 rows, well that's where agony begins, and business continuity takes a beating.

And to compound situations even further, more and more operations are suddenly becoming "unsafe for STATEMENT based logging", generating hundreds upon thousands of warning statements in error log files. With 5.1.50, LOAD DATA INFILE statements generate warnings now. This leads me to believe that at some point, these warnings will soon become unsupported operations. Blech! Does not give me the warm fuzzy feeling I used to have with MySQL. But I will still keep trying to find a real-world solution to this problem.


Matthew Montgomery said...

Use binlog_format=MIXED.

MySQL will write statement based binlog by default and switch over to ROW format when the statement is seen as unsafe. These "unsafe for STATEMENT based logging", messages should give you a warmer and fuzzier feeling since they are now caught and before 5.1 those statements could potentially result in the silent slave corruption. Now you are warned when the statement is non-deterministic and needs to use ROW based format. Those non-deterministic operations will still be supported in the future. If anything they will force use of ROW format.

For the case of a query that will modify thousands of rows, you can dynamically set the binlog_format for your session if either STATEMENT or ROW would be better. Sadly, to change the binlog_format value, you must have the SUPER privilege.

Partha Dutta said...

Matthew, that does not help the slave lag situation though

Kedar said...

I have seen two different projects with ROW and STATEMENT based replications resp.
I see more troubles (lags & errors) in ROW ones.
I saw "type" of bugs [delete / update failed] reported in ROW based replication only.
I personally like STATEMENT based only as it's easy to debug as well; for ROW based debugging you may read this.

sbester said...

Have you tried using slave_compressed_protocol=1 in my.cnf ?

Anonymous said...

so you should just choose: larger binlogs or different data in slave (if statement is really unsafe).

(But in reality there shouldn't be many statements which update too many rows.)

Swany said...

You can try using the --slave-compressed-protocol option, particularly with geographically distributed master/slave pairs.

If you are using a tunnel (like stunnel) or SSH tunnel to encrypt traffic, you might try to enable compression there instead.

Erik Ljungstrom said...

If you have any form of input/control over the applications writing to your DB, you can identify queries likely to change a large amount of data, and prepend the query with SET SESSION binlog_format = 'STATEMENT';

That way you can chose the best of both worlds.

Partha Dutta said...

Thanks for all of the comments. I'm still testing whether the compressed protocol will actually make a difference. Changing code is not really an option, since this DB is used by about 20 different applications.