您好,欢迎访问三七文档
当前位置:首页 > 商业/管理/HR > 质量控制/管理 > MySQL查询优化实践
MySQL查询调优实践DavidJiangWeibo:insidemysqlGtalk:jiangchengyao@gmail.comDTCC2012DTCC2012主题•B+树索引•简单查询优化–OLTP•复杂查询优化–OLAPDTCC2012DTCC2012关于我•近10年MySQL数据库使用经验–MySQL3.23~MySQL5.6•InnoSQL分支版本创始人–•独立数据库咨询顾问–•《MySQL技术内幕》系列作者–《MySQL技术内幕:InnoDB存储引擎》(已出版)–《MySQL技术内幕:SQL编程》(已出版)–《MySQL技术内幕:性能调优》(待出版)DTCC2012DTCC2012B+树索引•常用的索引–B+TreeIndex–TTreeIndex–HashIndex•什么是索引?–提高查询速度???–Itdepends–减少IO次数DTCC2012DTCC2012B+树索引•聚集索引(ClusteredIndex)–叶子节点存放整行记录•辅助索引(SecondaryIndex)–叶子节点存放rowidentifier•InnoDB:primarykey–书签查找(bookmarklookup)»查找代价大•MyISAM:物理位置(偏移量)–更新代价大•B+树的高度=IO次数=随机IO–3~4层DTCC2012DTCC2012B+树索引-InnoDB存储引擎DTCC2012DTCC2012B+树索引-MyISAM存储引擎DTCC2012DTCC2012B+树索引•聚集索引VS辅助索引–Clusteredindexkey=4bytes–Secondaryindexkey=4bytes–Keypointer=6bytes–Averagerowlength=300bytes–Pagesize=16K=16384bytes–Averagenodeoccupancy=70%(bothforleafandindexpage)–Fan-outforclusteredindex=16384*70%/(4+6)=1000–Fan-outforsecondaryindex=16384*70%/(4+6)=1000–Averagerowperpageforclusteredindex=16384*70%/300=35–Averagerowperpageforclusteredindex=16384*70%/(4+6)=1000HClusteredIndexSecondaryIndex21000*35=35,0001000*1000=1,000,0003(1000)2*35=35,000,000(1000)2*1000=1,000,000,0004(1000)3*35=35,000,000,000(1000)3*1000=1,000,000,000,000DTCC2012DTCC2012B+树索引•辅助索引的优势–树的高度较小=需要的IO次数少–树的大小较小=scan需要扫描的页较少–优化器倾向于使用辅助索引•辅助索引的劣势–查找完整记录还需查询•InnoDB:查询聚集索引•MyISAM:直接查找MYD物理位置DTCC2012DTCC2012B+树索引-InnoDB索引CREATETABLEUserInfo(useridINTNOTNULLAUTO_INCREMENT,usernameVARCHAR(30),registdateDATETIME,emailVARCHAR(50),PRIMARYKEY(userid),UNIQUEKEYidx_username(username),KEYidx_registdate(registdate))Engine=InnoDB;CREATETABLEUserInfo(useridINTNOTNULLAUTO_INCREMENT,usernameVARCHAR(30),registdateDATETIME,emailVARCHAR(50),PRIMARYKEY(userid));CREATETABLEidx_username(useridINTNOTNULL,usernameVARCHAR(30),PRIMARYKEY(username,userid));CREATETABLEidx_registdate(useridINTNOTNULL,registdateDATETIME),PRIMARYKEY(registdate,userid));DTCC2012DTCC2012B+树索引-InnoDB索引DTCC2012DTCC2012B+树索引-插入顺序问题•聚集索引插入–主键是自增长的–插入是顺序的–每页中的填充率高(15/16)–顺序扫描(Scan)可以达到磁盘顺序读的速率–一般不推荐使用UUID•插入非顺序DTCC2012DTCC2012B+树索引-插入顺序问题•辅助索引插入–插入的顺序是乱序的•插入(’David’,‘Monty’,‘Jimmy’,‘Amy’,‘Michael’)–插入的顺序是顺序的•插入时间–需要产生B+树的分裂•需要较大的开销–每页的填充率较低(60%~70%)–顺序扫描不能达到磁盘顺序读的速率•若插入是乱序的DTCC2012DTCC2012B+树索引-插入顺序问题•辅助索引顺序扫描速度–selectcount(1)fromstock•KEY`fkey_stock_2`(`s_i_id`),INT•Avgrowlength:355辅助索引:6分42秒•~4M/秒•Logical_reads:12700001Physical_reads:100057强制聚集索引:4分38秒•~120~130M/秒•Logical_reads:14670405Physical_reads:2170333DTCC2012DTCC2012简单查询优化•简单查询–OLTP•简单查询特点–SQL语句较为简单–返回少部分数据–并发量大•优化原则–减少随机读=使用索引•HighCardinalityDTCC2012DTCC2012简单查询优化•SELECT…FROMtablewhereprimary_key=???•SELECT…FROMtablewherekey=???DTCC2012DTCC2012简单查询优化-复合索引•索引键值为多个列–(a,b)a:1,1,2,2,3,3(a,b):(1,1),(1,2),(2,1),(2,4),(3,1),(3,2)b:1,2,1,4,1,2DTCC2012DTCC2012简单查询优化-复合索引•复合索引(a,b)可被使用于:•SELECT*FROMtWHEREa=?•SELECT*FROMtWHEREa=?ANDb=?•SELECT*FROMtWHEREa=?ORDERBYb•索引覆盖–查询b也可以使用该索引–WHEREb=???DTCC2012DTCC2012简单查询优化-索引覆盖•从辅助索引直接得到结果–不需要书签查找•(primarykey1,primarykey2,…,key1,key2,…)–SELECTkey2FROMtableWHEREkey1=xxx;–SELECTprimarykey2,key2FROMtableWHEREkey1=xxx;–SELECTprimarykey1,key2FROMtableWHEREkey1=xxx;–SELECTprimarykey1,primarykey2,key2FROMtableWHEREkey1=xxx;DTCC2012DTCC2012简单查询优化-索引覆盖CREATETABLEItemLog(logIdINTNOTNULLAUTO_INCREMENT,userIdVARCHAR(100)NOTNULL,itemIdINTNOTNULL,dateDATETIMENOTNULL,……PRIMARYKEY(logId),KEYidx_userId_date(userId,date))ENGINE=INNODB;SELECTCOUNT(1)FROMItemLogWHEREdate='2012-04-01'ANDdate'2012-05-01';DTCC2012DTCC2012简单查询优化-书签查找优化CREATETABLEUserInfo(useridINTNOTNULLAUTO_INCREMENT,usernameVARCHAR(30),registdateDATETIME,emailVARCHAR(50),PRIMARYKEY(userid),UNIQUEKEYidx_username(username),KEYidx_registdate(registdate))Engine=InnoDB;SELECTemailFROMUserInfoWHEREusername=‘David’If:聚集索引高度:4&&辅助索引高度:3Then:一共需要7次逻辑IODTCC2012DTCC2012简单查询优化-书签查找优化•分表–LikeIndexCoverageCREATETABLEUserInfo(useridINTNOTNULLAUTO_INCREMENT,usernameVARCHAR(30),registdateDATETIME,PRIMARYKEY(userid),UNIQUEKEYidx_username(username),KEYidex_registdate(registdate));CREATETABLEUserInfoDetail(useridINTNOTNULLAUTO_INCREMENT,usernameVARCHAR(30),emailVARCHAR(50),PRIMARYKEY(userid),UNIQUEKEYidx_username(username));SELECTemailFROMUserInfoDetailWHEREusername=‘David’If:辅助索引高度:3Then:逻辑IO减少为3DTCC2012DTCC2012简单查询优化-总结•每个页填充率高–包含的记录多•减少IO次数–IO=性能–OLTPonly•利用索引覆盖技术避免书签查找•利用分表技术避免书签查找DTCC2012DTCC2012复杂查询优化•复杂查询–OLAP•JOIN•Subquery•复杂查询特点–数据量大–并发少–需访问较多的数据–索引并不再是唯一的优化方向–调优工作复杂DTCC2012DTCC2012复杂查询优化-JOIN•MySQLJOIN类型–SimpleNestedLoopsJoin–BlockNestedLoopsJoin•MySQL5.5+–ClassicHashJoin•MariaDB5.3+•BlockHashNestedLoopsJoinDTCC2012DTCC2012复杂查询优化-SNLJForeachtuplerinRdoForeachtuplesinSdoIfrandssatisfythejoinconditionThenoutputthetupler,sForeachtuplerinRdolookuprjoinconditioninSindexiffounds==rThenoutputthetupler,sScancost(noindex)=Rn+Rn×Sn=O(Rn×Sn)Scancost(withindex)=Rn+Rn×SBH==O(Rn)DTCC2012DTCC2012复杂查询优化-SNLJ•INNERJOINwithIndex–INNERJOIN联接顺序可更改–优化器喜欢选择较小的表作为外部表–Scancost(withindex)=Rn+Rn×SBH==O(Rn)SELECTb.emp_no,a.title,a.from_date,a.to_dateFROMtitlesaINNERJOINemployeesbona.emp_no=b.emp_no;mysqlS
本文标题:MySQL查询优化实践
链接地址:https://www.777doc.com/doc-6182487 .html