Mysql Query Cache学习篇

langzi_xl

浏览: 22459 次
性别:
来自: 杭州

最近访客更多访客>>

liangzi_001

shaffer

yzzh9

毅毅Joey

博主相关

博客

微博

相册

留言

关于我

文章分类

社区版块

存档分类

博客分类：

Other

Cache MySQL SQL PHP 数据结构

基础介绍篇：

QueryCache是用来缓存select语句结果集的一种机制，不是缓存execution plan。需要注意：对select语句的大小写空格敏感。

用张形象的图可以展现下(图是从网上“借”的)：

无实验，无真像，我们还是动手理解下吧

先看下当前环境状态

mysql> select version();
+————–+
| version()    |
+————–+
| 5.1.45-debug |
+————–+
1 row in set (0.00 sec)

mysql> show status like ‘%qcache%’;
+————————-+———-+
| Variable_name           | Value    |
+————————-+———-+
| Qcache_free_blocks      | 1        |
| Qcache_free_memory      | 67099584 |
| Qcache_hits             | 0        |
| Qcache_inserts          | 0        |
| Qcache_lowmem_prunes    | 0        |
| Qcache_not_cached       | 2        |
| Qcache_queries_in_cache | 0        |
| Qcache_total_blocks     | 1        |
+————————-+———-+
8 rows in set (0.00 sec)

看看Query Cache的参数设置，已经打开query cache了(ON)

mysql> show variables like ‘%query_cache%’;
+——————————+———-+
| Variable_name                | Value    |
+——————————+———-+
| have_query_cache             | YES      |
| query_cache_limit            | 1048576 |
| query_cache_min_res_unit     | 2048     |
| query_cache_size             | 67108864 |
| query_cache_type             | ON       |
| query_cache_wlock_invalidate | OFF      |
+——————————+———-+
6 rows in set (0.00 sec)

这是一个有一百万行记录的测试表：

mysql> select count(distinct(pad)) from sbtest;
+———————-+
| count(distinct(pad)) |
+———————-+
| 1 |
+———————-+
1 row in set (4.57 sec)

再看下状态，由于是第一次执行本select语句，自然没在cache中，所以是insert进去没hit到：
mysql> show status like ‘%qcache%’;
+————————-+———-+
| Variable_name           | Value    |
+————————-+———-+
| Qcache_free_blocks      | 1        |
| Qcache_free_memory      | 67098048 |
| Qcache_hits             | 0        |
| Qcache_inserts          | 1        |
| Qcache_lowmem_prunes    | 0        |
| Qcache_not_cached       | 2        |
| Qcache_queries_in_cache | 1        |
| Qcache_total_blocks     | 4        |
+————————-+———-+
8 rows in set (0.00 sec)

再执行一下同样的语句看看效果，执行时间真快呀
mysql> select count(distinct(pad)) from sbtest;
+———————-+
| count(distinct(pad)) |
+———————-+
| 1 |
+———————-+
1 row in set (0.00 sec)

这下子hit到了^-^
mysql> show status like ‘%qcache%’;
+————————-+———-+
| Variable_name           | Value    |
+————————-+———-+
| Qcache_free_blocks      | 1        |
| Qcache_free_memory      | 67098048 |
| Qcache_hits             | 1        |
| Qcache_inserts          | 1        |
| Qcache_lowmem_prunes    | 0        |
| Qcache_not_cached       | 2        |
| Qcache_queries_in_cache | 1        |
| Qcache_total_blocks     | 4        |
+————————-+———-+
8 rows in set (0.00 sec)

如果语句大小写不一，还得insert没hit到，呵呵
mysql> select COUNT(distinct(pad)) from sbtest;
+———————-+
| COUNT(distinct(pad)) |
+———————-+
|                    1 |
+———————-+

1 row in set (1.99 sec)
mysql> show status like ‘%qcache%’;
+————————-+———-+
| Variable_name           | Value    |
+————————-+———-+
| Qcache_free_blocks      | 1        |
| Qcache_free_memory      | 67097024 |
| Qcache_hits             | 1        |
| Qcache_inserts          | 2        |
| Qcache_lowmem_prunes    | 0        |
| Qcache_not_cached       | 2        |
| Qcache_queries_in_cache | 2        |
| Qcache_total_blocks     | 6        |
+————————-+———-+
8 rows in set (0.00 sec)

如果sbtest表被update/insert/delete等语句操作后，这个select语句的query cache就会被失效。所以如果表被频繁更新，query cache的overhead就会被放大出来。一般的理论是默认打开QC，对一些涉及频繁更新的表的SQL语句可以加上SQL_NO_CACHE这样的 sql hints来禁用query cache。

顺便提下官方文档里一个有意思的benchmark测试结果数据：

1. 从表中查出某一条特定记录这样的简单SQL，如果保证每个SQL不同，这样子没利用query cache机制，最终显示query cache的overhead是13%

2. 如果保证上述那些SQL一样，利用到了query cache机制的话，结果显示会加速238%

源码学习篇：

其实涉及的主要的数据结构就是个hash表，key是query + database + flag

这个flag很庞大的其实:

struct Query_cache_query_flags
{
unsigned int client_long_flag:1;
unsigned int client_protocol_41:1;
unsigned int result_in_binary_protocol:1;
unsigned int more_results_exists:1;
unsigned int in_trans:1;
unsigned int autocommit:1;
unsigned int pkt_nr;
uint character_set_client_num;
uint character_set_results_num;
uint collation_connection_num;
ha_rows limit;
Time_zone *time_zone;
ulong sql_mode;
ulong max_sort_length;
ulong group_concat_max_len;
ulong default_week_format;
ulong div_precision_increment;
MY_LOCALE *lc_time_names;
};

一个正常的select QUERY流程很长很长，可以从sql_parse.cc的dispatch_command开始看，当执行到里面的mysql_parse时我们就会看到里面开始涉及query cache了：

lex_start(thd);
mysql_reset_thd_for_next_command(thd);

if (query_cache_send_result_to_client(thd, (char*) inBuf, length) <= 0)

这个query_cache_send_result_to_client很关键，她会判断当前的QUERY能不能从cache中找到结果集，如果符合条件找到的话，就会略过parse_sql、mysql_execute_command也就是说不需要再经过LEX YACC等词语法解析了，也不需要经过语句执行时和存储引擎打交道，避免掉操作系统层和disk之间的IO了。如果找不到，就会解析执行，并且如果结果集小于query_cache_limit参数值时会放到cache中（具体详见Query_cache::store_query函数）

那个hash查找是这样：

query_block = (Query_cache_block *) hash_search(&queries, (uchar*) sql,
tot_length);
至于具体的与query_cache_min_res_unit参数关联很大的cache内存分配策略、update/insert/delete时invalidate cache等等，我们都可以通过单步debug来学习，这里就不一一详述了：）

一个延伸的问题：

分享个有意思的Bug #40986 Leading line comments prevent Query Cache from caching result.

作为QA，我经常喜欢YY些测试用例，这个BUG其实test case很简单，但我第一次看到时还是有些震撼，从测试的角度。

我在想，如果我测试query cache这个feature，我在设计测试用例时会想到各种注释开头的select语句么？？？？？

参考资料：

http://wenku.baidu.com/view/14fc9ec3d5bbfd0a7956733f.html 来自Baron Schwartz

http://dev.mysql.com/doc/refman/5.0/en/query-cache.html 官方文档

http://www.surfchen.org/wiki/MySQL%E4%BC%98%E5%8C%96

http://bugs.mysql.com/bug.php?id=40986

分享到：