您好,欢迎访问三七文档
当前位置:首页 > 商业/管理/HR > 市场营销 > Hadoop Hardware @Twitter
HadoopHardware@Twitter:Sizedoesmatter.@joepand@eecraftHadoopSummit2013v2.3@Twitter#HadoopSummit20132JoepRottinghuisSoftwareEngineer@TwitterEngineeringManagerHadoop/HBaseteam@TwitterFollowme@joepJayShenoyHardwareEngineer@TwitterEngineeringManagerHW@TwitterFollowme@eecraftHW&Hadoopteams@Twitter,Manyothers•••••••••Aboutus@Twitter#HadoopSummit20133ScaleofHadoopClustersSingleversusmultipleclustersTwitterHadoopArchitectureHardwareinvestigationsResults•••••Agenda@Twitter#HadoopSummit2013Scale4ScalinglimitsJobTracker10’sthousandsofjobsperday;10’sKsconcurrentslotsNamenode250-300MobjectsinsinglenamespaceNamenode@~100GBheap-fullGCpausesShippingjobjarsto1,000’sofnodesJobHistoryserveratafew100’sKjobhistory/conffiles••••••#Nodes@Twitter#HadoopSummit2013When/whytosplitclusters?5InprinciplepreferenceforsingleclusterCommonlogs,sharedfreespace,reducedadminburden,morerackdiversityVaryingSLA’sWorkloaddiversityStorageintensiveProcessing(CPU/DiskIO)intensiveNetworkintensiveDataaccessHot,Warm,Cold•••••••••@Twitter#HadoopSummit2013ClusterArchitecture6@Twitter#HadoopSummit2013Hardwareinvestigations7@Twitter#HadoopSummit20138HadoopdoesnotneedliveHDDswapTwitterDC:NoSLAondatanodesRackSLA:Only1rackdownatanytimeinacluster•••Servicecriteriaforhardware@Twitter#HadoopSummit20139BaselineHadoopServer(~early2012)E56xxDIMMDIMMDIMME56xxDIMMDIMMDIMMPCHNICGbEHBAExpanderWorksforthegeneralcluster,but...NeedmoredensityforstoragePotentialIObottlenecks••Characteristics:Standard2Userver20servers/rackE5645CPUDual6-core72GBmemory12x2TBHDD2x1GbE•••••••@Twitter#HadoopSummit201310HadoopServer:PossibleevolutionCharacteristics:+CPUperformance?20servers/rackCandidateforDW•NICGbEHBAExpander16x2T?16x3T?24x3T?E5-26xxorE5-24xxDIMMDIMMDIMMDIMME5-26xxorE5-24xxDIMMDIMMDIMMDIMM10GbE?CandeployintothegeneralDWcluster,but...ToomuchCPUforstorageintensiveappsServerfailuredomaintoolargeifwescaleupdisks••@Twitter#HadoopSummit2013Rethinkinghardwareevolution11DebunkingmythsBiggerisalwaysbetterOnesizefitsallBacktoHadoopHardwareRoots:Scalehorizontally,notverticallyTwitterHadoopServer-“THS”•••••@Twitter#HadoopSummit201312NICSASHBAE3-12xxDIMMDIMMPCHGbETHSforbackupsStoragefocus:Costefficient(singlesocket,3Tdrives)Lessmemoryneeded••Characteristics:+IOPerformanceFewfastcoresE3-1230V2CPU16GBmemory12x3TBHDDSSDboot2x1GbE••••••@Twitter#HadoopSummit201313THSvariantforHadoop-ProcandHBaseNICSASHBA10GbEE3-12xxDIMMDIMMPCHCharacteristics:+IOPerformanceFewfastcoresE3-1230V2CPU32GBmemory12x1TBHDDSSDboot1x10GbE••••••Processing/throughputfocus:Costefficient(singlesocket,1Tdrives)MorediskandnetworkIOpersocket••@Twitter#HadoopSummit201314THSforcoldclusterNICSASHBAE3-12xxDIMMDIMMPCHGbECharacteristics:DiskEfficiencySomecomputeE3-1230V2CPU32GBmemory12x3TBHDD2x1GbE••••••Combinationofprevious2usecases:Space&powerefficientStoragedenseandsomeprocessingcapabilities••@Twitter#HadoopSummit201315Rack-levelviewBaselineTwitterHadoopServerBackupsProcColdPower~8kW~8kW~8kW~8kWCPUsockets;DRAM40;1440GB40;640GB40;1280GB40;1280GBSpindles;TBraw240;480TB480;1,440TB480;480TB480;1,440TBUplink;InternalBW20;40Gbps20;80Gbps40;400Gbps20;80Gbps1GTOR1GTOR1GTOR1GTOR1GTOR10GTOR@Twitter#HadoopSummit201316ProcessingperformancecomparisonBenchmarkBaselineServerTHS(-Cold)TestDFSIO(writereplication=1)360MB/s/node780MB/s/nodeTeraGen(30TBreplication=3)1:36hrs1:35hrsTeraSort(30TB,replication=3)6:11hrs4:22hrs2ParallelTeraSort(30TBeach,replication=3)10:36hrs6:21hrsApplication#14:37min3:09minApplicationset#213:3hrs10:57hrsPerformancebenchmarksetup:Eachclusters102nodesofrespectivetypeEfficientserver=3racks,Baseline5+racks“Dated”stack:CentOS5.5,Sun1.6JRE,Hadoop2.0.3•••@Twitter#HadoopSummit2013Results17@Twitter#HadoopSummit201316LZOperformancecomparison18@Twitter#HadoopSummit2013Recap19AtacertainscaleitmakessensetosplitintomultipleclustersForus:RT,PROC,DW,COLD,BACKUPS,TST,EXPForlargeenoughclusters,dependingonuse-case,itmaybeworthtochoosedifferentHWconfigurations•••@Twitter#HadoopSummit2013Conclusion20@Twitterour“TwitterHadoopServer”notonlysavesmany$$$,itisalsofaster!#ThankYou@joepand@eecraftCometalktousatbooth26
本文标题:Hadoop Hardware @Twitter
链接地址:https://www.777doc.com/doc-3177977 .html