Saturday 13 September 2014

Variables to take care of while using InnoDB Persistent Statistics

In this post I would like to share more information about how InnoDB Persistent Statistics works, which other variables we as DBAs should take care of while using Persistent Statistics, and a real-world problem I came across along with its solution.

PROBLEM: Query execution taking around 80% more time on MySQL 5.6 compared to 5.1.

REASON: Recently we upgraded a legacy DB instance from MySQL 5.1 to 5.6 (done in two phases: 5.1 -> 5.5 -> 5.6). We took care of most of the variable tuning required as part of the upgrade. Later we came to know that certain queries were taking around 80% more time than usual; we started digging into the issue and eventually found that it was because of "InnoDB Persistent Statistics".

Before going through the problem in detail, I would first like to go through how exactly InnoDB Persistent Statistics works.

InnoDB has always computed statistics for each InnoDB table to help the optimizer find the most efficient query execution plan. Starting with MySQL 5.6 you can make these statistics persistent, so that the index usage and join order for a particular query are less likely to change. This feature is on by default, enabled by the configuration option innodb_stats_persistent.
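
As a quick sanity check, a minimal sketch like the one below reads and sets the global option (the option is dynamic in 5.6):

    -- Check whether persistent statistics are enabled (ON by default in 5.6)
    SHOW GLOBAL VARIABLES LIKE 'innodb_stats_persistent';

    -- The option is dynamic, so it can also be toggled at runtime
    SET GLOBAL innodb_stats_persistent = ON;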

if "innodb_stats_persistent" not eanbled then, InnoDB statistics are recalculated at any of the following events:

* When the table is opened

* When the table has changed a lot (10% of the table has been updated/deleted or inserted)

* When ANALYZE TABLE is run

* When SHOW TABLE STATUS or SHOW INDEX FROM is run

* When InnoDB Monitor is turned ON
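
For illustration, with transient (non-persistent) statistics the statements below can trigger a recalculation; the table name orders is just a placeholder, and whether the metadata statements actually recalculate statistics also depends on the innodb_stats_on_metadata setting:

    -- Explicit recalculation
    ANALYZE TABLE orders;

    -- Metadata statements that can recalculate transient statistics
    -- (subject to innodb_stats_on_metadata)
    SHOW TABLE STATUS LIKE 'orders';
    SHOW INDEX FROM orders;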

But if "innodb_stats_persistent" enable, Does not recalculate the statistics automatically, only when ANALYZE TABLE is run. This means that the same query plan will always be chosen even when MySQL is restarted. A query plan using the global statistics can only change after ANALYZE TABLE has been run (manually). This also means that the user and/or DBA is responsible for running ANALYZE TABLE regularly if the data in the table has been changed a lot or if the table data changes more than certain amount of the data when the "innodb_stats_auto_recalc" is ON.

if "innodb_stats_persistent" is enabled then the following 2 variables plays a major role and every DBA must be aware of.

1) innodb_stats_persistent_sample_pages: The number of index pages to sample when estimating cardinality and other statistics for an indexed column, such as those calculated by ANALYZE TABLE. Increasing the value improves the accuracy of index statistics, which can improve the query execution plan, but it will also take more time (as the number of pages sampled increases, the time to complete ANALYZE TABLE increases proportionally, and it may also cause MySQL to do more I/O operations).

2) innodb_stats_auto_recalc: Causes InnoDB to automatically recalculate persistent statistics after the data in a table is changed substantially. The threshold value is currently 10% of the rows in the table. This setting applies to tables created when the innodb_stats_persistent option is enabled, or where the clause STATS_PERSISTENT=1 is specified in a CREATE TABLE or ALTER TABLE statement (a sketch of setting both variables follows below).
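
A minimal sketch of tuning these, both globally and per table; the table name orders and the value 64 are only examples, not recommendations:

    -- Global defaults (used the next time statistics are recalculated)
    SET GLOBAL innodb_stats_persistent_sample_pages = 64;
    SET GLOBAL innodb_stats_auto_recalc = ON;

    -- Or per table, overriding the global settings for that table
    ALTER TABLE orders
      STATS_PERSISTENT=1,
      STATS_AUTO_RECALC=1,
      STATS_SAMPLE_PAGES=64;

    -- Rebuild the statistics with the new sample size
    ANALYZE TABLE orders;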

For bigger tables with a larger number of indexes, the default value of innodb_stats_persistent_sample_pages=20 may not give good execution plans; in that case, increasing this value may produce better query execution plans. But be aware that it will take more time to calculate the statistics, and that ANALYZE TABLE should still be run manually when the table data changes by more than 10%, as mentioned above, if automatic recalculation is not enabled.

SOLUTION: The problem was that the index statistics were not good enough to generate better query execution plans; that is why some of the queries involving bigger tables were taking around 80% more time than usual. Although we ran OPTIMIZE TABLE at the database level as part of the upgrade activity, to defragment the tables and get better index statistics, it didn't help. Later, for the bigger tables, we re-ran the analysis with various values of "innodb_stats_persistent_sample_pages", confirmed that this variable was indeed the cause, and came up with a good number of pages based on table size. In the end the query execution times came back to normal, and were even a bit better than on 5.1.
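
A rough sketch of the kind of iteration involved; the sample-pages value, the table names and the query are illustrative only, not the actual production values:

    -- 1. Try a larger sample size for the big table (value is illustrative)
    SET GLOBAL innodb_stats_persistent_sample_pages = 256;

    -- 2. Rebuild the statistics for the affected table
    ANALYZE TABLE orders;

    -- 3. Compare the optimizer's plan (and timing) before and after
    EXPLAIN
    SELECT o.order_id, o.total
      FROM orders o
      JOIN order_items oi ON oi.order_id = o.order_id
     WHERE o.created_at >= '2014-01-01';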

For more information, go through: http://dev.mysql.com/doc/refman/5.6/en/innodb-persistent-stats.html
