突然收到MySQL报警,从库的数据库挂了,一直在不停的重启,打开错误日志,发现有张表坏了。innodb表损坏不能通过repair table 等修复myisam的命令操作。现在记录下解决过程,下次遇到就不会这么手忙脚乱了。
处理过程:
一遇到报警之后,直接打开错误日志,里面的信息:
InnoDB: Database page corruption on disk or a failedInnoDB: file read of page 30506.InnoDB: You may have to recover from a backup.130509 20:33:48 InnoDB: Page dump in ascii and hex (16384 bytes):##很多十六进制的代码…………InnoDB: End of page dump130509 20:37:34 InnoDB: Page checksum 1958578898, prior-to-4.0.14-form checksum 3765017239InnoDB: stored checksum 3904709694, prior-to-4.0.14-form stored checksum 3765017239InnoDB: Page lsn 5 614270220, low 4 bytes of lsn at page end 614270220InnoDB: Page number (if stored to page already) 30506,InnoDB: space id (if created with >= MySQL-4.1.1 and stored already) 19InnoDB: Page may be an index page where index id is 54InnoDB: (index "PRIMARY" of table "maitem"."email_status")InnoDB: Database page corruption on disk or a failedInnoDB: file read of page 30506.InnoDB: You may have to recover from a backup.InnoDB: It is also possible that your operatingInnoDB: system has corrupted its own file cacheInnoDB: and rebooting your computer removes theInnoDB: error.InnoDB: If the corrupt page is an index pageInnoDB: you can also try to fix the corruptionInnoDB: by dumping, dropping, and reimportingInnoDB: the corrupt table. You can use CHECKInnoDB: TABLE to scan your table for corruption.InnoDB: See also http://dev.mysql.com/doc/refman/5.5/en/forcing-innodb-recovery.htmlInnoDB: about forcing recovery.InnoDB: A new raw disk partition was initialized orInnoDB: innodb_force_recovery is on: we do not allowInnoDB: database modifications by the user. Shut downInnoDB: mysqld and edit my.cnf so that newraw is replacedInnoDB: with raw, and innodb_force_... is removed.130509 20:39:35 [Warning] Invalid (old?) table or database name '#sql2-19c4-5'
从错误日志里面很清楚的知道哪里出现了问题,该怎么处理。这时候数据库隔几s就重启,所以差不多可以说你是访问不了数据库的。所以马上想到要修复innodb表了。
以前在Performance的blog上看过类似文章。
当时想到的是在修复之前保证数据库正常,不是这么异常的无休止的重启。所以就修改了配置文件的一个参数:innodb_force_recovery
innodb_force_recovery影响整个InnoDB存储引擎的恢复状况。默认为0,表示当需要恢复时执行所有的innodb_force_recovery可以设置为1-6,大的数字包含前面所有数字的影响。当设置参数值大于0后,可以对表进行select,create,drop操作,但insert,update或者delete这类操作是不允许的。1(SRV_FORCE_IGNORE_CORRUPT):忽略检查到的corrupt页。2(SRV_FORCE_NO_BACKGROUND):阻止主线程的运行,如主线程需要执行full purge操作,会导致crash。3(SRV_FORCE_NO_TRX_UNDO):不执行事务回滚操作。4(SRV_FORCE_NO_IBUF_MERGE):不执行插入缓冲的合并操作。5(SRV_FORCE_NO_UNDO_LOG_SCAN):不查看重做日志,InnoDB存储引擎会将未提交的事务视为已提交。6(SRV_FORCE_NO_LOG_REDO):不执行前滚的操作。
因为错误日志里面提示出现了坏页,导致数据库崩溃,所以这里把innodb_force_recovery 设置为1,忽略检查到的坏页。重启数据库之后,正常了,没有出现上面的错误信息。找到错误信息出现的表:
(index "PRIMARY" of table "maitem"."email_status")
数据页面的主键索引(clustered key index)被损坏。这种情况和数据的二级索引(secondary indexes)被损坏相比要糟很多,因为后者可以通过使用OPTIMIZE TABLE命令来修复,但这和更难以恢复的表格目录(table dictionary)被破坏的情况来说要好一些。
操作步骤:
因为被破坏的地方只在索引的部分,所以当使用innodb_force_recovery = 1运行InnoDB时,操作如下:
执行check,repair table 都无效alter table email_status engine =myisam; #也报错了,因为模式是innodb_force_recovery =1。ERROR 1025 (HY000): Error on rename of '...' to '....' (errno: -1)
建立一张表:create table email_status_bak #和原表结构一样,只是把INNODB改成了MYISAM。把数据导进去insert into email_status_bak select * from email_status;删除掉原表:drop table email_status;注释掉innodb_force_recovery 之后,重启。重命名:rename table edm_email_status_bak to email_status;最后该回存储引擎alter table edm_email_status engine = innodb
总结:
这里的一个重要知识点就是 对 innodb_force_recovery 参数的理解了,要是遇到数据损坏甚至是其他的损坏。可能上面的方法不行了,需要尝试另一个方法:insert into tb select * from ta limit X;甚至是dump出去,再load回来。