DBAs’ Inconceivable Tales: A Rare Cause of Replication Lag

Our MySQL support engineers encountered a rather unusual cause of replication lag. We covered the most common causes of replication lag in our previous article about MySQL parallel replication. In this particular case, the root cause was more strange. The servers were very high spec with NVMe disks and 10Gbit Ethernet. Analysis showed that the bottleneck was neither disk I/O nor slow, long running transactions.

Parallel replication was in place, and that workload on this system fits the feature very well (many thousands of fast transactions per second), it fits the parallel replication feature very well. But there was never enough relay log available to saturate more than one SQL thread.

The problem manifested in a way that would normally imply a severe network bottleneck, but network bandwidth tested at close to a gigabyte per second. Yet the relay log would always rapidly get far behind the master’s binlog position and keep falling further behind. It looks like the problem is not new, and we are not the first to encounter this problem – but we may be the first to have come up with a generally applicable solution.

Slow Replication Over a Fast Network

Testing binlog transfer performance over MySQL protocol using the mysqlbinlog utility showed orders of magnitude faster speeds. Therefore, the bottleneck is in the way that MySQL I/O thread on the slave pulls the master’s binlog into the local relay log. We also verified this by stopping the slave sql_thread while leaving the io_thread running, so relay log would buffer up. When we restarted the sql_thread, the lag would rapidly reduce – right up to the point where the sql_thread caught up with the relay log.

Switching to semi-synchronous helped keep replication lag under control, but it slowed the master down to the point where it couldn’t get even close to keep up with the required workload.

Findings Summary

To summarise our findings:

  • The cause of lag is that relay logs are not being filled fast enough to keep up with generation of binary logs upstream
  • Network is not the bottleneck
  • MySQL protocol as utilised by mysqlbinlog to pull binary logs is not the bottleneck

Other than figuring out the root cause of this performance bug in MySQL 8.0 (also observed recently in MariaDB), what could we do as a relatively quick fix to work around the problem?

Coming up With a Solution

This excellent article on fast point-in-time recovery came to mind. We already established that binlogs transfer fast using scp and mysqlbinlog. If we fetch the binary logs into relay logs using a faster method, then we could sideload them as relay logs using the same method described in the point in time recovery article.

Tooling Up

Of course we can’t be doing this manually all the time. We had to write a tool to continuously sideload relay logs. It will be part of what will become the “Silicon Toolkit”, so we called it st-sideload-relay and in the meantime you can get it directly from our git repository. It is designed to “just work” – you run it as root with no parameters. If you have socket authentication without password for root@localhost, it will figure out the replication username and password from the master.info file or mysql.slave_master_info table. If you need authentication, you can put it in ~/.my.cnf and st-sideload-relay will figure it out from there. The only thing you will need to change is grant the replication user the REPLICATION CLIENT privilege, in addition to the REPLICATION SLAVE privilege because st-sideload-relay needs to be able to get the available binary logs upstream and their size.

Overall Effect

Just how much difference does this make? A lot! Here are the replication lag graphs of what happens when we turn off st-sideload-relay and switch back to purely built in replication, and how quickly it recovers when we turn on st-sideload-relay again. The boost is quite staggering.

st-sideload-relay disabled at the bottom of the sawtooth pattern and enabled at the top.

It is important to stress that the benefits of this are not universal. This tool will ONLY help if the bottleneck is in transferring the upstream binary log to the relay log, and NOT caused by slow network. If your bottleneck is somewhere else, this tool won’t help you. If you have such a problem, get in touch with our MySQL consultants. Most replication problems have solutions which we can advise you on.