Tools used by large websites

- Perlbal: load balancing across multiple web servers
- MogileFS: distributed file system. Some companies consider MogileFS better suited than Hadoop to handling small files.
- memcached: shared memory (??). Keeps database results or other frequently read data in an in-memory cache.
- Moxi: a proxy for memcached

王耀聰 (jazz@nchc.org.tw)
陳威宇 (waue@nchc.org.tw)
Training course

HBase

HBase is a distributed column-oriented database built on top of HDFS.

HBase is...

- A distributed data store that can scale horizontally to 1,000s of commodity servers and petabytes of indexed storage.
- Designed to operate on top of the Hadoop distributed file system (HDFS) or Kosmos File System (KFS, aka Cloudstore) for scalability, fault tolerance, and high availability.
- Integrated into the Hadoop map-reduce platform and paradigm.

Benefits

- Distributed storage
- Table-like data structure (a multi-dimensional map)
- High scalability
- High availability
- High performance

Who uses HBase

- Adobe: internal use (structured data)
- Kalooga: image search engine
- Meetup: community meetup website
- Streamy: successfully migrated from MySQL to HBase
- Trend Micro: cloud antivirus scanning architecture
- Yahoo!: stores document fingerprints to avoid duplicates

History

- Started toward ... by Chad Walters and Jim ...
- 2006.11: Google releases its paper on BigTable
- 2007.2: Initial HBase prototype created as a Hadoop contrib
- 2007.10: First usable HBase
- 2008.1: Hadoop becomes an Apache top-level project and HBase becomes a subproject
- 2008.10~: HBase 0.18 and 0.19 released

HBase Is Not...

- Tables have one primary index, the row key.
- No join operators.
- Scans and queries can select a subset of the available columns, perhaps by using a wildcard.
- There are three types of lookups:
  - Fast lookup using the row key and an optional timestamp
  - Full table scan
  - Range scan from region start to end

HBase Is Not... (2)

- Limited atomicity and transaction support. HBase supports multiple batched mutations of single rows only.
- Data is unstructured and untyped.
- Not accessed or manipulated via SQL.
  - Programmatic access via Java, REST, or Thrift APIs
  - Scripting via JRuby

Why Bigtable?

- RDBMS performance is good for transaction processing, but for very large scale analytic processing the available solutions are commercial, expensive, and specialized.
- Very large scale analytic processing means:
  - Big queries, typically range or table scans
  - Big databases (100s of TB)

Why Bigtable? (2)

- Map-reduce on Bigtable, optionally with Cascading on top to support some relational algebra, may be a cost-effective solution.
- Sharding is not a solution for scaling open source RDBMS platforms:
  - Application specific
  - Labor-intensive (re)partitioning

Why HBase?

- HBase is a Bigtable clone.
- It is open source.
- It has a good community and promise for the future.
- It is developed on top of the Hadoop platform and integrates well with it, if you are using Hadoop already.
- It has a Cascading connector.

HBase benefits over RDBMS

- No real indexes
- Automatic partitioning
- Scales linearly and automatically with new nodes
- Commodity hardware
- Fault tolerance
- Batch processing

Data Model

- Tables are sorted by row.
- The table schema defines only its column families.
  - Each family consists of any number of columns.
  - Each column consists of any number of versions.
  - Columns only exist when inserted; NULLs are free.
  - Columns within a family are sorted and stored together.
- Everything except table names is byte[].
- (Row, Family:Column, Timestamp) -> Value

[Figure: row key, column family, value, timestamp]

Members

Master
- Responsible for monitoring region servers
- Load balancing for regions
- Redirects clients to the correct region servers
- Currently the single point of failure (SPOF)

Region server (slaves)
- Serve client requests (Write/Read/Scan)
- Send heartbeats to the master
- Throughput and the number of regions scale with the number of region servers

Regions

- A table is made up of one or more regions.
- A region is specified by its startKey and endKey.
- Each region may reside on several different nodes and consists of several HDFS files and blocks; Hadoop is responsible for replicating these regions.

Case study: Blog

Logical data model
- A blog entry consists of the fields title, date, author, type, and text.
- A user consists of fields such as username and password.
- Each blog entry can have many comments.
- Each comment consists of title, author, and text.

[Figure: ERD]

Blog: HBase Table Schema

- Row key: the type (as a two-character abbreviation) combined with the timestamp.
  - Rows are therefore sorted first by type and then by timestamp.
  - This makes it convenient to read the table with scan().
- The one-to-many relationship between BLOGENTRY and COMMENT is represented by a dynamic number of columns inside column families such as comment_title, comment_author, and comment_text.
- Each column is named after the timestamp of its comment, so the columns in each column family are automatically sorted by time.

[Figure: Architecture]

ZooKeeper

- HBase depends on ZooKeeper (Chapter 13) and by default it manages a ZooKeeper instance as the authority on cluster state.

Operation

- The -ROOT- table holds the list of .META. table regions.
- The .META. table holds the list of all user-space regions.

Installation (1)

    $ wget ... *.tar.gz -C /opt/
    $ sudo ln -sf /opt/hbase-0.20.3 /opt/hbase
    $ sudo chown -R $USER:$USER /opt/hbase
    $ sudo mkdir /var/hadoop/
    $ sudo chmod 777 /var/hadoop

Start Hadoop...

Setup (1)

    $ vim /opt/hbase/conf/hbase-env.sh

    export JAVA_HOME=/usr/lib/jvm/java-6-sun
    export HADOOP_CONF_DIR=/opt/hadoop/conf
    export HBASE_HOME=/opt/hbase
    export HBASE_
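
To make the data model and the blog row-key design from the earlier slides more concrete, here is a minimal in-memory sketch in Python. It is not the HBase API: `ToyTable`, `blog_row_key`, the `info` column family, the type codes `te` and `nw`, and the zero-padded 13-digit timestamp are all assumptions made for illustration. It only mimics the (Row, Family:Column, Timestamp) -> Value mapping, the sorted row order, and a range scan over row keys.

```python
from bisect import bisect_left, insort


class ToyTable:
    """Toy stand-in for a table: (row, family:column, timestamp) -> value."""

    def __init__(self, families):
        self.families = set(families)  # the schema lists only column families
        self.row_keys = []             # row keys kept in sorted byte order
        self.cells = {}                # row -> {column -> {timestamp -> value}}

    def put(self, row, column, timestamp, value):
        family = column.split(b":", 1)[0]
        if family not in self.families:
            raise KeyError(f"unknown column family: {family!r}")
        if row not in self.cells:
            insort(self.row_keys, row)  # keep rows sorted on insert
            self.cells[row] = {}
        # A column only exists once something is written to it.
        self.cells[row].setdefault(column, {})[timestamp] = value

    def get(self, row, column, timestamp=None):
        versions = self.cells.get(row, {}).get(column)
        if not versions:
            return None
        if timestamp is None:
            timestamp = max(versions)   # default to the newest version
        return versions.get(timestamp)

    def scan(self, start_row, stop_row):
        """Yield rows with start_row <= key < stop_row, in key order."""
        lo = bisect_left(self.row_keys, start_row)
        hi = bisect_left(self.row_keys, stop_row)
        for row in self.row_keys[lo:hi]:
            yield row, self.cells[row]


def blog_row_key(entry_type, timestamp):
    # Hypothetical encoding: 2-character type code + zero-padded timestamp,
    # so byte order matches (type, time) order.
    return entry_type.encode("ascii") + b"%013d" % timestamp


table = ToyTable([b"info", b"comment_title"])
table.put(blog_row_key("te", 2000), b"info:title", 2000, b"Regions")
table.put(blog_row_key("nw", 1500), b"info:title", 1500, b"Release news")
table.put(blog_row_key("te", 1000), b"info:title", 1000, b"Intro to HBase")
# One comment becomes one column, named after the comment's timestamp.
table.put(blog_row_key("te", 1000), b"comment_title:1200", 1200, b"Nice post")

# Range scan over every row whose key starts with the type code "te".
tech_rows = [row for row, _ in table.scan(b"te", b"tf")]
```

Because the type code leads the row key, all entries of one type sit next to each other, and a single range scan returns them in time order without any secondary index.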
Title: HBase, BigTable, Hadoop, MapReduce, ZooKeeper, and others: today's large-scale websites