Choosing the proper innodb_log_file_size

If you're doing a significant amount of writes to InnoDB tables, a decent innodb_log_file_size is important for MySQL performance. However, setting it too large will increase recovery time, so after a MySQL crash or power failure it may take a long time before the MySQL server is operational again.

So how do you find the optimal combination?

First let me explain what happens on recovery and why a large innodb_log_file_size slows it down. During startup after a crash, InnoDB scans the log files to find log records which have only been applied in memory and do not exist in the tablespace. Log records for modifications which did not make it to the tablespace are then applied. This is called the redo phase of recovery. It can take a pretty long time, and this time depends on a number of variables: how large the rows are (smaller log records mean more records for the same size of logs), how random the data modifications were (random updates need random IO to check whether pages are up to date), the number of unflushed pages in the InnoDB buffer pool and its size, and the performance of the IO subsystem.

As there are so many factors, it is hard to come up with any general guideline, something like "10 minutes of recovery time per 1GB of logs". Instead you need to apply a load which is typical for your application, crash MySQL in the middle and watch it recover. Doing this several times, you should be able to estimate how long recovery takes and adjust your logs appropriately. The good thing is that the redo phase is close to proportional to the size of the log files, so expect 1GB logs to take about twice as long to apply as 512MB logs.
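
The proportionality mentioned above can be turned into a rough estimate. The following is only a sketch in Python: it assumes redo time scales linearly with log size, and the function names and the crash-test numbers are made up for illustration. Your own measurements should replace them.

```python
def estimate_redo_seconds(measurements, log_size_mb):
    """Estimate redo time for a given log size.

    measurements: list of (log_size_mb, observed_redo_seconds) pairs
    gathered from your own crash tests.
    Assumption: redo time is proportional to log size.
    """
    # Least-squares fit of a line through the origin: seconds = k * size_mb.
    k = (sum(size * secs for size, secs in measurements)
         / sum(size * size for size, _ in measurements))
    return k * log_size_mb


def largest_log_size_mb(measurements, max_recovery_seconds):
    """Largest log size whose estimated redo time fits the time budget."""
    seconds_per_mb = estimate_redo_seconds(measurements, 1.0)
    return max_recovery_seconds / seconds_per_mb


# Hypothetical crash-test results: (combined log size in MB, redo seconds).
crash_tests = [(512, 300), (512, 330), (1024, 620)]

if __name__ == "__main__":
    print(estimate_redo_seconds(crash_tests, 2048))
    print(largest_log_size_mb(crash_tests, 15 * 60))
```

Keep in mind this covers the redo phase only, and the estimate is no better than the crash tests behind it.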

The redo phase is, however, only one of the phases of recovery. The other important one is the undo phase: after the log files are applied and the database is in a "physically consistent" state, InnoDB needs to roll back transactions which were not committed but whose changes already made it to the database. Unlike the "redo" phase, the "undo" phase can't be reduced by sizing your log files. What's more, the undo phase can even be slower with small log files. The undo phase takes considerable time if transactions are long. For example, if you delete 10000000 rows in a single transaction and crash in the middle, recovery can take quite a long time. The only way to reduce the "undo" phase is to size your transactions appropriately, so that each update/insert/delete affects a limited number of rows.
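
One way to size transactions is to split a huge delete into batches by primary key range and commit each batch separately. Below is a minimal Python sketch that only generates the statements. The function name, the table and column names and the batch size are all hypothetical, and it assumes an integer key column.

```python
def batched_delete_statements(table, id_column, min_id, max_id, batch_size):
    """Yield DELETE statements that each cover at most batch_size key values.

    Each statement is meant to be executed and committed as its own
    transaction, so a crash leaves at most one batch to roll back.
    Identifiers are interpolated directly, so pass trusted names only.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    start = min_id
    while start <= max_id:
        end = min(start + batch_size - 1, max_id)
        yield (f"DELETE FROM {table} "
               f"WHERE {id_column} BETWEEN {start} AND {end}")
        start = end + 1


if __name__ == "__main__":
    # Hypothetical table "events" with integer primary key "id".
    for stmt in batched_delete_statements("events", "id", 1, 50000, 10000):
        print(stmt)
```

The right batch size depends on your rows and workload, so treat 10000 here as a placeholder rather than a recommendation.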

The good thing about the undo phase, however, is that it can be done in the background, as is the case in MySQL 5.0. The rows affected by the background rollback, however, cannot be modified until the rollback is complete.

One more thing to consider: how large do your log files need to be at all? You could run a benchmark with 1GB and with 2GB log files and see whether there is any performance benefit. Beyond a certain size, increasing the log file size does not dramatically increase performance, although this again depends on configuration and workload.
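
If you benchmark several log sizes, picking the point of diminishing returns can be sketched like this in Python. The function name, the 5% threshold and the throughput figures are assumptions for illustration, not measurements.

```python
def smallest_sufficient_log_size(results, min_gain=0.05):
    """Pick the log size beyond which throughput stops improving much.

    results: list of (log_size_mb, throughput) pairs from your benchmarks.
    Returns the smallest size whose next larger size improves throughput
    by less than min_gain (0.05 means 5%). If every step still helps,
    the largest tested size is returned.
    """
    if not results:
        raise ValueError("need at least one benchmark result")
    ordered = sorted(results)
    for (size, tput), (_, next_tput) in zip(ordered, ordered[1:]):
        if next_tput < tput * (1 + min_gain):
            return size
    return ordered[-1][0]


# Hypothetical benchmark results: (log size in MB, transactions per second).
benchmarks = [(256, 1000), (512, 1400), (1024, 1450), (2048, 1460)]

if __name__ == "__main__":
    print(smallest_sufficient_log_size(benchmarks))
```

Whatever size this suggests still has to be weighed against the recovery time you are willing to accept.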

Note that at this point 4GB is the maximum combined size allowed for InnoDB log files, which is, however, a large enough limit for most configurations.
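
A quick sanity check for the combined size could look like the Python sketch below. It assumes the combined size is innodb_log_file_size multiplied by innodb_log_files_in_group, and that the total has to stay strictly below 4GB. The function names are made up for illustration.

```python
# 4GB combined limit quoted above (assumed here to be a strict upper bound).
MAX_COMBINED_LOG_BYTES = 4 * 1024 ** 3


def combined_log_size(log_file_size, files_in_group):
    """Total bytes used by all log files in the group."""
    return log_file_size * files_in_group


def fits_combined_limit(log_file_size, files_in_group,
                        limit=MAX_COMBINED_LOG_BYTES):
    """True if the combined log size stays below the limit."""
    return combined_log_size(log_file_size, files_in_group) < limit


if __name__ == "__main__":
    one_gb = 1024 ** 3
    print(fits_combined_limit(one_gb, 2))      # 2GB combined
    print(fits_combined_limit(2 * one_gb, 2))  # 4GB combined
```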