Lessons Learned from OpenTSDB
Benoît “tsuna” Sigoure
tsuna@stumbleupon.com

Or why OpenTSDB is the way it is and how it changed iteratively to correct some of the mistakes made

Key concepts
• Data Points: (time, value)
• Metrics: proc.loadavg.1m
• Tags: host=web42 pool=static
• Metric + Tags = Time Series
• Order of magnitude: 10^6 time series, 10^12 data points

put proc.loadavg.1m 1234567890 0.42 host=web42 pool=static

OpenTSDB @ StumbleUpon
• Main production monitoring system for ~2 years
• Storing hundreds of billions of data points
• Adding over 1 billion data points per day
• 13000 data points/s → 130 QPS on HBase
• If you had a 5 node cluster, this load would hardly make it sweat

Do’s
• Wider rows to seek faster
  before: ~4KB/row, after: ~20KB
• Make writes idempotent and independent
  before: start rows at arbitrary points in time
  after: align rows on 10m (then 1h) boundaries
• Store more data per KeyValue
  Remember you pay for the key along each value in a row, so large keys are really expensive

Don’ts
• Use HTable / HTablePool in app servers
  asynchbase + Netty or Finagle = performance++
• Put variable-length fields in composite keys
  They’re hard to scan
• Exceed a few hundred regions per RegionServer
  “Oversharding” introduces overhead and makes recovering from failures more expensive

Use asynchbase
[Figure: three benchmark charts comparing HTable and asynchbase, elapsed time in seconds against number of threads (4, 8, 16, 24, 32). Panels: scan, sequential read, sequential write.]
See detailed benchmark at goo.gl/8at5V

How OpenTSDB came to be the way it is
Questions:
• How to store time series data efficiently in HBase?
• How to enable concurrent writes without synchronization between the writers?
• How to save space/memory when storing hundreds of billions of data items in HBase?

Time Series Data in HBase: Take 1

Key (timestamp)   Value
1234567890        1
1234567892        2
1234567894        3

The column qualifier is a don’t-care.
Simplest design: only 1 time series, 1 row with a single KeyValue per data point. Supports time-range scans.

Time Series Data in HBase: Take 2

Key (metric name + timestamp)   Value
foo  1234567890                 1
foo  1234567892                 3
fool 1234567890                 2

Metric name first in row key for data locality.
Problem: can’t store the metric as text in row key due to space concerns.

Time Series Data in HBase: Take 3

Key (metric ID + timestamp)   Value
0x1 1234567890                1
0x1 1234567892                3
0x2 1234567890                2

Use a separate table to assign unique IDs to metric names (and tags, not shown here). IDs give us a predictable length and achieve desired data locality.

Separate Lookup Table:

Key    Value
0x1    foo
0x2    fool
foo    0x1
fool   0x2

Time Series Data in HBase: Take 4

Key               +0   +2
0x1 1234567890    1    3
0x2 1234567890    2

The Take 3 row 0x1 1234567892 → 3 is folded into column +2 of the first row.
Reduce the number of rows by storing multiple consecutive data points in the same row. Fewer rows = faster to seek to a specific row.

Gotcha #1: wider rows don’t save any space*
The table above is a misleading representation. Actual table stored:

Key               Column   Value
0x1 1234567890    +0       1
0x1 1234567890    +2       3
0x2 1234567890    +0       2

* Until magic prefix compression happens in upcoming HBase 0.94

Devil is in the details: when to start new rows?
Naive answer: start on first data point, after some time start a new row.

Walkthrough of the naive scheme (Client → TSD → HBase):
• First data point: the client sends foo 1000000000 1. Start a new row keyed 0x1 1000000000, with column +0 = 1.
• Keep adding points: the client sends foo 1000000010 2, stored in column +10 = 2, and so on until...
• ...some arbitrary limit, say 10 min: the client sends foo 1000000599 42, stored in column +599 = 42.
• Then start a new row: the client sends foo 1000000610 51, and a second row keyed 0x1 1000000600 appears holding 51.

But this scheme fails with multiple TSDs:
• The client sends foo 1234567890 1. A TSD creates a new row keyed 0x1 1234567890, with column +0 = 1.
• The client sends foo 1234567892 3. The TSD adds it to that row as column +2 = 3.
• Maybe a connection failure occurred, and the client is retransmitting foo 1234567892 3 to another TSD. The first TSD adds to its row, while the other TSD creates a new row.

Key               +0   +2
0x1 1234567890    1    3
0x1 1234567892    3

Oops!

Time Series Data in HBase: Take 5

Key               +90   +92
0x1 1234567800    1     3
0x2 1234567800    2

In order to scale easily and keep TSD stateless, make writes independent & idempotent.
New rule: rows are aligned on 10 min. boundaries. Base timestamp always a multiple of 600.

Time Series Data in HBase: Take 6

Key               +1890   +1892
0x1 1234566000    1       3
0x2 1234566000    2

1 data point every ~10s = 60 data points / row. Not much. Go to wider rows to further increase seek speed.
One hour rows = 6x fewer rows. Base timestamp always a multiple of 3600.

Remember: wider rows don’t save any space! Actual table stored:

Key               Column   Value
0x1 1234566000    +1890    1
0x1 1234566000    +1892    3
0x2 1234566000    +1890    2

Key is easily 4x bigger than column + value and repeated.

Time Series Data in HBase: Take 7

Solution: “compact” columns by concatenation. The single-point cells +1890 → 1 and +1892 → 3 of row 0x1 are replaced by one cell. Actual table stored:

Key               Column          Value
0x1 1234566000    +1890,+1892     1,3
0x2 1234566000    +1890           2

Space savings on disk and in memory are huge: data is 4x-8x smaller!

Summary
• Use asynchbase
• Use Netty or Finagle
• Wider table > Taller table
• Short family names
• Make writes idempotent
• Make writes independent
• Compact your data
• Have predictable key sizes

¿Questions?
opentsdb.net
Benoît “tsuna” Sigoure
tsuna@stumbleupon.com
Fork me on GitHub
Think this is cool? We’re hiring
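
The row-alignment rule from Take 5 and Take 6 and the column concatenation from Take 7 can be sketched in a few lines. This is a minimal illustration, not OpenTSDB's code: `FakeTable`, `row_key`, `put` and `compact` are hypothetical names, a Python dict stands in for HBase, tags are left out as in the slides, and the real byte-level encoding of keys and qualifiers is not modeled.

```python
from collections import defaultdict

ROW_SPAN = 3600  # seconds covered by one row (Take 6); Take 5 used 600


def row_key(metric_id, timestamp, span=ROW_SPAN):
    """Split a data point's timestamp into an aligned row key and an offset.

    The base timestamp is always a multiple of `span`, so every writer
    derives the same key for the same point without coordination.
    """
    base = timestamp - timestamp % span
    return (metric_id, base), timestamp - base


class FakeTable:
    """Hypothetical in-memory stand-in for an HBase table.

    Maps row key -> {column offset: value}.
    """

    def __init__(self):
        self.rows = defaultdict(dict)

    def put(self, metric_id, timestamp, value):
        key, offset = row_key(metric_id, timestamp)
        # A retransmitted point lands in the same cell: the write is idempotent.
        self.rows[key][offset] = value

    def compact(self, key):
        """Concatenate a row's cells into one (offsets, values) pair."""
        cells = sorted(self.rows[key].items())
        offsets = tuple(offset for offset, _ in cells)
        values = tuple(value for _, value in cells)
        return offsets, values


table = FakeTable()
table.put(0x1, 1234567890, 1)
table.put(0x1, 1234567892, 3)
table.put(0x1, 1234567892, 3)  # same point retransmitted via another TSD
table.put(0x2, 1234567890, 2)
```

With the data points from the slides, both metrics end up in rows based at 1234566000, and compacting the row for metric 0x1 yields the offsets (1890, 1892) with the values (1, 3), matching the Take 7 table. Because the key depends only on the data point itself, it makes no difference which TSD handles a retried write.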