Monday 23 June 2014

How MySQL optimizer decides which filesort algorithm to be used while processing an order by clause.

Today when i was going through the variable "max_length_for_sort_data" got to know some good information how order by internally works and which kind of algorithm it uses to process order by. It is something interesting to me, so decided to share this information with you guys also.

As of MySQL 5.6 There are 2 variations of file sort algorithm used as explained follows :

    1) The original filesort algorithm
    2) The modified filesort algorithm

There are few differences between original and modified file sort algorithms like how it retrieves and process the data from the table. For more details go through.

I would like to focus mainly on the following :
    a) The "original filesort" algorithm retrieves the rowid and the columns specified in the order by clause and put them in the sort buffer size, sort the data in the soft buffer size and again retrieve the data for other columns specified in the select statement from the table. so it will read the data from the table twice.
    b) But, when it comes to the "modified filesort" algorithm it will retrieve the rowid and all the columns specified in the select statement from the table to sort_buffer_size, then sort the data based on the columns specified in order by clause and sends the output.For more details go through.

Suppose that a table t1 has four VARCHAR columns a, b, c, and d and that the optimizer uses filesort for this query:

SELECT * FROM t1 ORDER BY a, b;

The query sorts by a and b, but returns all columns, so the columns referenced by the query are a, b, c, and d. Depending on which filesort algorithm the optimizer chooses, the query executes as follows:

For the original algorithm, sort buffer tuples have these contents: (fixed size a value, fixed size b value,row ID into t1)
The optimizer sorts on the fixed size values. After sorting, the optimizer reads the tuples in order and uses the row ID in each tuple to read rows from t1 to obtain the select list column values.

For the modified algorithm, sort buffer tuples have these contents: (fixed size a value, fixed size b value,a value, b value, c value, d value)
The optimizer sorts on the fixed size values. After sorting, the optimizer reads the tuples in order and uses the values for a, b, c, and d to obtain the select list column values without reading t1 again.

So how mysql decides which filesort algorithm to be used?

Based on the value specified by "max_length_for_sort_data", mysql decides which filesort algorithm to be used. Lets go through more detail how it will work.

For a given query optimizer first calculates the "row_size" which includes the rowid,pointers and the columns data specified in the select statement's select clause and order by clause, if the row_size > max_length_for_sort_data then optimizer uses the "original filesort" algorithm, if the row_size <= max_length_for_sort_data then optimizer uses the "modified filesort" algorithm.

The following is the detailed example which illustrates how exactly it works :

mysql> create table t1 (a varchar(10), b varchar(10), c varchar(10), d varchar(10));
Query OK, 0 rows affected (0.00 sec)

mysql> insert into t1 values('1','1','1','1'),('2','2','2','2'),('3','3','3','3');
Query OK, 3 rows affected (0.17 sec)
Records: 3  Duplicates: 0  Warnings: 0


mysql> SET OPTIMIZER_TRACE="enabled=on",END_MARKERS_IN_JSON=on;
Query OK, 0 rows affected (0.00 sec)

mysql>  SET OPTIMIZER_TRACE_MAX_MEM_SIZE=90000000;
Query OK, 0 rows affected (0.00 sec)

CASE #1 :

mysql> set max_length_for_sort_data=128;
Query OK, 0 rows affected (0.00 sec)

NOTE : I am reducing the value of "max_length_for_sort_data" from its default (1024) to 128 for testing.

mysql>  select * from t1 order by a,b;
+------+------+------+------+
| a    | b    | c    | d    |
+------+------+------+------+
| 1    | 1    | 1    | 1    |
| 2    | 2    | 2    | 2    |
| 3    | 3    | 3    | 3    |
+------+------+------+------+

mysql> select * from information_schema.optimizer_trace\G
*************************** 1. row ***************************
                            QUERY: select * from t1 order by a,b
                            TRACE: {
  "steps": [
    {
      "join_preparation": {
        "select#": 1,
        "steps": [
          {
            "expanded_query": "/* select#1 */ select `t1`.`a` AS `a`,`t1`.`b` AS `b`,`t1`.`c` AS `c`,`t1`.`d` AS `d` from `t1` order by `t1`.`a`,`t1`.`b`"
          }
        ] /* steps */
      } /* join_preparation */
    },
    {
      "join_optimization": {
        "select#": 1,
        "steps": [
          {
            "table_dependencies": [
              {
                "table": "`t1`",
                "row_may_be_null": false,
                "map_bit": 0,
                "depends_on_map_bits": [
                ] /* depends_on_map_bits */
              }
            ] /* table_dependencies */
          },
          {
            "rows_estimation": [
              {
                "table": "`t1`",
                "table_scan": {
                  "rows": 3,
                  "cost": 1
                } /* table_scan */
              }
            ] /* rows_estimation */
          },
          {
            "considered_execution_plans": [
              {
                "plan_prefix": [
                ] /* plan_prefix */,
                "table": "`t1`",
                "best_access_path": {
                  "considered_access_paths": [
                    {
                      "access_type": "scan",
                      "rows": 3,
                      "cost": 1.6,
                      "chosen": true,
                      "use_tmp_table": true
                    }
                  ] /* considered_access_paths */
                } /* best_access_path */,
                "cost_for_plan": 1.6,
                "rows_for_plan": 3,
                "sort_cost": 3,
                "new_cost_for_plan": 4.6,
                "chosen": true
              }
            ] /* considered_execution_plans */
          },
          {
            "attaching_conditions_to_tables": {
              "original_condition": null,
              "attached_conditions_computation": [
              ] /* attached_conditions_computation */,
              "attached_conditions_summary": [
                {
                  "table": "`t1`",
                  "attached": null
                }
              ] /* attached_conditions_summary */
            } /* attaching_conditions_to_tables */
          },
          {
            "clause_processing": {
              "clause": "ORDER BY",
              "original_clause": "`t1`.`a`,`t1`.`b`",
              "items": [
                {
                  "item": "`t1`.`a`"
                },
                {
                  "item": "`t1`.`b`"
                }
              ] /* items */,
              "resulting_clause_is_simple": true,
              "resulting_clause": "`t1`.`a`,`t1`.`b`"
            } /* clause_processing */
          },
          {
            "refine_plan": [
              {
                "table": "`t1`",
                "access_type": "table_scan"
              }
            ] /* refine_plan */
          }
        ] /* steps */
      } /* join_optimization */
    },
    {
      "join_execution": {
        "select#": 1,
        "steps": [
          {
            "filesort_information": [
              {
                "direction": "asc",
                "table": "`t1`",
                "field": "a"
              },
              {
                "direction": "asc",
                "table": "`t1`",
                "field": "b"
              }
            ] /* filesort_information */,
            "filesort_priority_queue_optimization": {
              "usable": false,
              "cause": "not applicable (no LIMIT)"
            } /* filesort_priority_queue_optimization */,
            "filesort_execution": [
            ] /* filesort_execution */,
            "filesort_summary": {
              "rows": 3,
              "examined_rows": 3,
              "number_of_tmp_files": 0,
              "sort_buffer_size": 63224,
              "sort_mode": "<sort_key, rowid>"
            } /* filesort_summary */
          }
        ] /* steps */
      } /* join_execution */
    }
  ] /* steps */
}
MISSING_BYTES_BEYOND_MAX_MEM_SIZE: 0
          INSUFFICIENT_PRIVILEGES: 0
1 row in set (0.01 sec)

From the above output if we look at "filesort_information" section, "sort_mode": "<sort_key, rowid>" which indicate that the filesort algorithm used is "original filesort".

I have tried with different values for "max_length_for_sort_data" and at last it worked for me at the value of "max_length_for_sort_data=171", why because the row_lenth is 171 bytes ( it didn' worked for me even the values is 170, and it worked in all the cases if the value is >=171).

CASE #2 :

mysql> set max_length_for_sort_data=171;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from t1 order by a,b;
+------+------+------+------+
| a    | b    | c    | d    |
+------+------+------+------+
| 1    | 1    | 1    | 1    |
| 2    | 2    | 2    | 2    |
| 3    | 3    | 3    | 3    |
+------+------+------+------+
3 rows in set (0.00 sec)

mysql> select * from information_schema.optimizer_trace\G                                                                                      *************************** 1. row ***************************
                            QUERY: select * from t1 order by a,b
                            TRACE: {
  "steps": [
    {
      "join_preparation": {
        "select#": 1,
        "steps": [
          {
            "expanded_query": "/* select#1 */ select `t1`.`a` AS `a`,`t1`.`b` AS `b`,`t1`.`c` AS `c`,`t1`.`d` AS `d` from `t1` order by `t1`.`a`,`t1`.`b`"
          }
        ] /* steps */
      } /* join_preparation */
    },
    {
      "join_optimization": {
        "select#": 1,
        "steps": [
          {
            "table_dependencies": [
              {
                "table": "`t1`",
                "row_may_be_null": false,
                "map_bit": 0,
                "depends_on_map_bits": [
                ] /* depends_on_map_bits */
              }
            ] /* table_dependencies */
          },
          {
            "rows_estimation": [
              {
                "table": "`t1`",
                "table_scan": {
                  "rows": 3,
                  "cost": 1
                } /* table_scan */
              }
            ] /* rows_estimation */
          },
          {
            "considered_execution_plans": [
              {
                "plan_prefix": [
                ] /* plan_prefix */,
                "table": "`t1`",
                "best_access_path": {
                  "considered_access_paths": [
                    {
                      "access_type": "scan",
                      "rows": 3,
                      "cost": 1.6,
                      "chosen": true,
                      "use_tmp_table": true
                    }
                  ] /* considered_access_paths */
                } /* best_access_path */,
                "cost_for_plan": 1.6,
                "rows_for_plan": 3,
                "sort_cost": 3,
                "new_cost_for_plan": 4.6,
                "chosen": true
              }
            ] /* considered_execution_plans */
          },
          {
            "attaching_conditions_to_tables": {
              "original_condition": null,
              "attached_conditions_computation": [
              ] /* attached_conditions_computation */,
              "attached_conditions_summary": [
                {
                  "table": "`t1`",
                  "attached": null
                }
              ] /* attached_conditions_summary */
            } /* attaching_conditions_to_tables */
          },
          {
            "clause_processing": {
              "clause": "ORDER BY",
              "original_clause": "`t1`.`a`,`t1`.`b`",
              "items": [
                {
                  "item": "`t1`.`a`"
                },
                {
                  "item": "`t1`.`b`"
                }
              ] /* items */,
              "resulting_clause_is_simple": true,
              "resulting_clause": "`t1`.`a`,`t1`.`b`"
            } /* clause_processing */
          },
          {
            "refine_plan": [
              {
                "table": "`t1`",
                "access_type": "table_scan"
              }
            ] /* refine_plan */
          }
        ] /* steps */
      } /* join_optimization */
    },
    {
      "join_execution": {
        "select#": 1,
        "steps": [
          {
            "filesort_information": [
              {
                "direction": "asc",
                "table": "`t1`",
                "field": "a"
              },
              {
                "direction": "asc",
                "table": "`t1`",
                "field": "b"
              }
            ] /* filesort_information */,
            "filesort_priority_queue_optimization": {
              "usable": false,
              "cause": "not applicable (no LIMIT)"
            } /* filesort_priority_queue_optimization */,
            "filesort_execution": [
            ] /* filesort_execution */,
            "filesort_summary": {
              "rows": 3,
              "examined_rows": 3,
              "number_of_tmp_files": 0,
              "sort_buffer_size": 202091,
              "sort_mode": "<sort_key, additional_fields>"
            } /* filesort_summary */
          }
        ] /* steps */
      } /* join_execution */
    }
  ] /* steps */
}
MISSING_BYTES_BEYOND_MAX_MEM_SIZE: 0
          INSUFFICIENT_PRIVILEGES: 0
1 row in set (0.01 sec)

From the above output if we look at "filesort_information" section, "sort_mode": "<sort_key, additional_fields>" which indicate that the filesort algorithm used is "modidied filesort" and also observed that sort buffer size used is high (202091) in case of "modified filesort" algorithm and it is low (63224) in case of "original filesort" algorithm, it is because of in case of "modified filesort" algorithm it keeps all select columns data in the sort buffer.

CONCLUSION : For better execution of order by queries we may need to increase this value to some extent based on requirements and before change this value do some testing and then only proceed. Most of the cases all the columns specified in the select statement’s select clause might not be specified in the order by clause, if that is the case leaving this value to the default might give the better performance than changing this value.  

For information about the optimizer trace, see MySQL Internals: Tracing the Optimizer


MySQL InternalsMySQL HighAvailabilityMySQL Performance TuningMySQL Query OptimizationMySQL performanceMySQL FabricMySQL HAMySQL InstallationMySQL UpgradeInnoDB Performance TuningInnoDB Buffer Pool SizeMySQL Performance TuningMySQL ClusterMySQL Latest NewsNews and EventsMySQL Customers

No comments:

Post a Comment