线上出现的慢SQL:
SELECT c.id FROM fnbl_pim_contact AS c WHERE c.fuserid = 6246640 and (first_name !='' or middle_name is not null and middle_name !='' or last_name !='') AND c.status in ('N','U') order by id limit 1;
需要十多秒甚至更长时间才能出来结果。
这个表的大小在几千万的级别。
mysql> desc SELECT c.id FROM fnbl_pim_contact AS c WHERE c.fuserid = 6246640 and (first_name !='' or middle_name is not null and middle_name !='' or last_name !='') AND c.status in ('N','U') order by id limit 1;
+----+-------------+-------+-------+----------------------------------------------------------------------+------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+----------------------------------------------------------------------+------+---------+------+------+-------------+
| 1 | SIMPLE | c | index | IDX_FUSERID_STATUS_ISDEL_ISPROFILE,IDX_FUSERID_ID,IDX_FUSERID_UPDATE | id | 8 | NULL | 696 | Using where |
+----+-------------+-------+-------+----------------------------------------------------------------------+------+---------+------+------+-------------+
1 row in set (0.00 sec)
从执行计划来看,看似索引页走了,扫描的行数也很少,不应该慢才对。但是需要注意到一点就是计划中显示 type为 index,这是索引扫描,这是仅次于ALL全表扫描的耗时操作。
这里为何要选择索引扫描呢,而且走的索引是主键。
再看看这个表的索引。
mysql> show index from fnbl_pim_contact;
+------------------+------------+------------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+------------------+------------+------------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| fnbl_pim_contact | 0 | PRIMARY | 1 | userid | A | 361576 | NULL | NULL | | BTREE | | |
| fnbl_pim_contact | 0 | PRIMARY | 2 | id | A | 28202968 | NULL | NULL | | BTREE | | |
| fnbl_pim_contact | 1 | id | 1 | id | A | 28202968 | NULL | NULL | | BTREE | | |
| fnbl_pim_contact | 1 | IDX_CONTACT_STATUS_ISDEL | 1 | userid | A | 239008 | NULL | NULL | | BTREE | | |
| fnbl_pim_contact | 1 | IDX_CONTACT_STATUS_ISDEL | 2 | status | A | 575570 | NULL | NULL | YES | BTREE | | |
| fnbl_pim_contact | 1 | IDX_CONTACT_STATUS_ISDEL | 3 | isdel | A | 575570 | NULL | NULL | YES | BTREE | | |
| fnbl_pim_contact | 1 | IDX_CONTACT_STATUS_ISDEL | 4 | isprofile | A | 600063 | NULL | NULL | | BTREE | | |
| fnbl_pim_contact | 1 | IDX_CONTACT_USERID_UPDATE | 1 | userid | A | 408738 | NULL | NULL | | BTREE | | |
| fnbl_pim_contact | 1 | IDX_CONTACT_USERID_UPDATE | 2 | last_update | A | 14101484 | NULL | NULL | YES | BTREE | | |
| fnbl_pim_contact | 1 | IDX_CONTACT_KEY_ID | 1 | id | A | 28202968 | NULL | NULL | | BTREE | | |
| fnbl_pim_contact | 1 | IDX_FUSERID_STATUS_ISDEL_ISPROFILE | 1 | FUSERID | A | 220335 | NULL | NULL | | BTREE | | |
| fnbl_pim_contact | 1 | IDX_FUSERID_STATUS_ISDEL_ISPROFILE | 2 | status | A | 552999 | NULL | NULL | YES | BTREE | | |
| fnbl_pim_contact | 1 | IDX_FUSERID_STATUS_ISDEL_ISPROFILE | 3 | isdel | A | 552999 | NULL | NULL | YES | BTREE | | |
| fnbl_pim_contact | 1 | IDX_FUSERID_STATUS_ISDEL_ISPROFILE | 4 | isprofile | A | 626732 | NULL | NULL | | BTREE | | |
| fnbl_pim_contact | 1 | IDX_FUSERID_ID | 1 | FUSERID | A | 339794 | NULL | NULL | | BTREE | | |
| fnbl_pim_contact | 1 | IDX_FUSERID_ID | 2 | id | A | 28202968 | NULL | NULL | | BTREE | | |
| fnbl_pim_contact | 1 | IDX_FUSERID_UPDATE | 1 | FUSERID | A | 231171 | NULL | NULL | | BTREE | | |
| fnbl_pim_contact | 1 | IDX_FUSERID_UPDATE | 2 | last_update | A | 7050742 | NULL | NULL | YES | BTREE | | |
+------------------+------------+------------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
这么多索引,为何选择主键,这是语句中有order by id。优化器认为排序是比较耗时的操作,而走索引是不会有额外的操作,因此选择了主键。
去掉order by id来验证猜想:
mysql> desc SELECT c.id FROM fnbl_pim_contact AS c WHERE c.fuserid = 6246640 and (first_name !='' or middle_name is not null and middle_name !='' or last_name !='') AND c.status in ('N','U') limit 1;
+----+-------------+-------+------+----------------------------------------------------------------------+--------------------+---------+-------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+----------------------------------------------------------------------+--------------------+---------+-------+-------+-------------+
| 1 | SIMPLE | c | ref | IDX_FUSERID_STATUS_ISDEL_ISPROFILE,IDX_FUSERID_ID,IDX_FUSERID_UPDATE | IDX_FUSERID_UPDATE | 8 | const | 40502 | Using where |
+----+-------------+-------+------+----------------------------------------------------------------------+--------------------+---------+-------+-------+-------------+
1 row in set (0.00 sec)
mysql> SELECT c.id FROM fnbl_pim_contact AS c WHERE c.fuserid = 6246640 and (first_name !='' or middle_name is not null and middle_name !='' or last_name !='') AND c.status in ('N','U') limit 1;
+---------+
| id |
+---------+
| 3778623 |
+---------+
1 row in set (0.00 sec)
从上面结果看来,猜测是正确的。但是去掉order by id不合符业务,所以不能简单的去掉order by id来处理。
我们知道当where条件和order by字段在同一个索引时,order by也是不会有额外的排序操作的。
我们看到,表中还有个索引为IDX_FUSERID_ID(fuserid,id),从语句where 条件和order by来看,选择这条索引是比较好的。强制走IDX_FUSERID_ID 索引看看:
mysql> desc SELECT c.id FROM fnbl_pim_contact AS c force index(IDX_FUSERID_ID) WHERE c.fuserid = 6246640 and (first_name !='' or middle_name is not null and middle_name !='' or last_name !='') AND c.status in ('N','U') order by id limit 1;
+----+-------------+-------+------+----------------+----------------+---------+-------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+----------------+----------------+---------+-------+-------+-------------+
| 1 | SIMPLE | c | ref | IDX_FUSERID_ID | IDX_FUSERID_ID | 8 | const | 45412 | Using where |
+----+-------------+-------+------+----------------+----------------+---------+-------+-------+-------------+
1 row in set (0.00 sec)
可以看到type由 index改变为ref,这是非唯一性索引访问,比索引扫描快很多。
mysql> SELECT c.id FROM fnbl_pim_contact AS c force index(IDX_FUSERID_ID) WHERE c.fuserid = 6246640 and (first_name !='' or middle_name is not null and middle_name !='' or last_name !='') AND c.status in ('N','U') order by id limit 1;
+---------+
| id |
+---------+
| 3673097 |
+---------+
1 row in set (0.00 sec)
同时可以看到执行时间基本没有。
从这个例子可以看出,优化器在索引选择上有时也会出现误差。分析一条慢SQL,要从多个角度去看,而不能只看已个方面,从上面的执行计划中,rows 很小,给人的感觉就是扫描的行数
都这么少,怎么会慢呢。但是同时不能忘了 扫描方式为index这种耗时的操作。因此优化可以从这个角度去考虑。
借用一个阿里DBA的公式来讲解他在做sql优化的时候遵循的原则:
T=S/V(T代表时间,S代表路程,V代表速度)
S指SQL所需访问的资源总量,V指SQL单位时间所能访问的资源量,T自然就是SQL执行所需时间了;我们为了获得SQL最快的执行时间,可以根据公式定义上去反推:
在S不变的情况下,我们可以提升V来降低T:通过适当的索引调整,我们可以将大量的速度较慢的随机IO转换为速度较快的顺序IO;通过提升服务器的内存,使得将更多的数据放到内存中,会比数据放到磁盘上会得到明显的速度提升;采用电子存储介质进行数据存储和读取的SSD,突破了传统机械硬盘的性能瓶颈,使其拥有极高的存储性能;在提升V上我们可以采用较高配置的硬件来完成速度的提升;
在V不变的情况下,我们可以减小S来降低T:这是SQL优化中非常核心的一个环节,在减小S环节上,DBA可以做的可以有很多,通常可以在查询条件中建立适当的索引,来避免全表扫描;有时候可以改写SQl,
添加一些适当的提示符,来改变SQL的执行计划,使SQL以最少的扫描路径完成查询;当这些方法都使用完了之后,你是否还有其他方案来优化喃?
还有就是要深入的了解业务,当DBA深入的了解业务之后,这个时候能站在业务上,又站DB角度上考虑,这个时候在去做优化,有时候能达到事半功倍的效果。
其实我们这里参考的就是V不变,缩小S。从而达到降低T.
http://tieba.baidu.com/p/2638469262