Analysis and Calculation of Massive Logs Based on Hadoop

Abstract

With the development of science and technology, transistor circuits have gradually approached their physical performance limits. Moore's Law has ceased to hold since 2005: doubling the computing power of a single CPU every 18 months is no longer possible. Meanwhile, the number of people online has exploded, and companies that provide network services have to analyze massive record logs every day in order to modify their products to meet customers' requirements in time. Some critical product data must therefore be processed within a given time. Traditional database technology cannot provide enough computing power and storage for all of this data to meet customers' data-processing needs. The concept of cloud computing was proposed to solve this problem, and it is becoming the direction of the near future. Today, IT industry giants such as Google, IBM, Facebook, Yahoo and Microsoft have adopted their own cloud computing platforms to process massive data and provide computing power.

In this paper, the Hadoop cloud computing platform was selected to enhance the capacity for processing large volumes of logs. Hadoop is an open-source distributed computing framework. It offers good scalability, lower operating costs, higher efficiency and better stability. Furthermore, the MapReduce programming model is well suited to text-processing applications. In addition, Hadoop handles all the low-level messaging for programmers during parallel computing: programmers only need to deal with the logic of the data and need not consider the messages exchanged between the parallel computers on the Hadoop platform. They can focus on the critical issues and speed up program development. As a result, Hadoop has been widely used since its release.

This paper studies Hadoop's HDFS and MapReduce model in depth. Following Hadoop's data-processing model, we design a data-processing model to fit our business requirements. This model is applied in practice to massive log processing and cuts down data-processing time. Most importantly, the Hadoop cloud platform removes the data-processing bottleneck of a single server.

In this paper, a Hadoop cloud computing platform was designed and implemented. On this platform, the data-processing model was designed and implemented to compute log statistics and improve the speed of massive log processing. Data-processing programs for several product statistics were written and run on our own Hadoop cloud platform, and performance tests were carried out. By analyzing the relationship between computing power and the number of worker nodes, and comparing the computing power of multiple nodes with that of a single database, the experimental data show that Hadoop has a strong advantage in processing massive data.

Keywords: Hadoop; HDFS; MapReduce; cloud computing; massive data processing and analysis

Contents (surviving entries)

2.1 HDFS
2.2 HDFS
3 Hadoop MapReduce
3.1 MapReduce
3.2 MapReduce
4 Hadoop
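The abstract's point that programmers write only the data logic while the framework handles distribution can be illustrated with a minimal, single-process sketch of the map, shuffle and reduce steps applied to log statistics. This is not Hadoop's API and not the thesis's actual model: the log format (`<timestamp> <product> <action>`), the sample records and the names `map_line`, `shuffle`, `reduce_counts` and `run_job` are all assumptions made for illustration. On a real cluster the framework, not user code, would perform the shuffle and run the map and reduce steps across many nodes.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical log records (an assumption): "<timestamp> <product> <action>"
SAMPLE_LOG = [
    "t1 mail login",
    "t2 search query",
    "t3 mail send",
    "malformed line",          # wrong field count, skipped by the map step
    "t4 search query",
]


def map_line(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit (product, 1) for each well-formed record."""
    fields = line.split()
    if len(fields) == 3:
        yield fields[1], 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group values by key; a real framework does this across the network."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce step: sum the counts collected for one product."""
    return key, sum(values)


def run_job(lines: Iterable[str]) -> Dict[str, int]:
    """Chain map, shuffle and reduce in one process for illustration."""
    pairs = (pair for line in lines for pair in map_line(line))
    return dict(reduce_counts(k, vs) for k, vs in shuffle(pairs).items())


if __name__ == "__main__":
    print(run_job(SAMPLE_LOG))
```

Only `map_line` and `reduce_counts` carry business logic here; everything else stands in for work the platform would do, which is the division of labor the abstract describes.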
