2. Infrastructure: MapReduce

2. Infrastructure: HBASE

2. Infrastructure: design
• Scalability, Reliability, Performance, Throughput, Latency
• Design:
  – Partitioning/Sharding
  – Consistent hash
  – Consistency model
  – Data models
  – Storage layouts
  – Log Structured Merge Tree (Bigtable)
• Notes:
  – Strict consistency
  – Eventual consistency
  – Timestamp and vector clocks
  – Gossip
  – Primary key-value/blob/structure/semi-structure
  – Secondary indexes
  – Tables/Namespaces
  – Multi-version storage
  – Row-based, column-based storage
  – Bloom filters

3. Big Data

3. Big Data: hypergrowth
• Reuters-21578: about 10K docs (ModApte)
• RCV1: about 807K docs
• LinkedIn job title data: about 100M docs
Bekkerman et al., SIGIR 2001
Bekkerman & Scholz, CIKM 2008
Bekkerman & Gavish, KDD 2011
From KDD 2011

3. Big Data: hypergrowth
(Figure labels: hours, days, months, years)

3. Big Data: hypergrowth
• Bigness
  – Volume, Velocity, Size
• Structure
  – Variety, Variability, Complexity

3. Big Data: Machine Learning
• Thousand instances
  – Manually
• Million instances
  – Preprocessing, modeling
• Billion instances
  – Distributed storage/computing, modeling parallelization
• Trillion instances
  – ……

3. Big Data: Google YouTube
• Data: X PB, trillion-row tables
• Query: oracle-mysql-columnIO
• ETL: python-sawzall+tenzing+python
• Reporting: microstrategy-ABI

3. Big Data: Google
• Structure
  – Relational (Hosted SQL)
  – Record-oriented (Bigtable)
  – Nested (Protocol Buffer)
  – Graphs (Pregel)
• Analysis
  – Number crunching (MR, FlumeJava)
  – Ad hoc (Dremel, BigQuery)
  – Precise vs. estimate (Sawzall)
  – Model generation & prediction (Prediction API)
• Core Features
  – RESTful
  – Partitions/Buckets
  – Access control/Auth
  – Scalable, Fast, Simple

3. Big Data: Teradata

3. Big Data: warehouse
(Diagram labels: collection, hdfs, table, ..., storages; Ad hoc query; Reporting; Modeling; Dashboard; Pre-Processed; Fact Table; DW Marts; Raw Log; Cold Dataset; Hot Dataset; ETL; BI Tools; Data Discovery; Visualization; ...; collection & backup; Cashflow analysis; Treasury; Marketing; Customer Service; Channel Management; Risk Management; Accounts/Mis; Exposure Analysis; Product Analysis; User Behavior; Competitor)

3. Big Data: Hive
• Data Model
  – Tables
  – Partitions
  – Buckets

3. Big Data: practice

4. Cloud
• Software services & business models
  – SaaS ("Software as a service")
    • Salesforce, Zoho, Evernote, Dropbox
  – PaaS ("Platform as a service")
    • App Engine, Heroku
  – IaaS ("Infrastructure as a service")
    • Amazon EC2, Rackspace Cloud Servers
  – BPaaS ("Business Process as a service")
• Major Players
  – Users:
    • Google, Facebook, Microsoft
  – Services:
    • Amazon, Microsoft, Rackspace, HP, SAP, Oracle
    • Box.net, Dropbox
  – Infrastructure and equipment providers
    • Juniper, HP, Cisco, Intel, SAP, Oracle
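
The "Consistent hash" item in the section 2 design list is only named on the slide, so here is a minimal sketch of one common way to realize it. Everything in it is an assumption for illustration and not taken from the deck: the class name `ConsistentHashRing`, the use of MD5 as the ring hash, the `replicas` virtual-node count, and the node names.

```python
import bisect
import hashlib


def _ring_hash(key: str) -> int:
    """Map a string to a position on the ring (MD5 is an illustrative choice)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class ConsistentHashRing:
    """Hypothetical consistent-hash ring with virtual nodes."""

    def __init__(self, nodes=(), replicas=100):
        self.replicas = replicas  # virtual nodes per physical node (assumed value)
        self._points = []         # sorted ring positions
        self._owner = {}          # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        # Place `replicas` virtual points for this node on the ring.
        for i in range(self.replicas):
            point = _ring_hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove_node(self, node):
        # Remove only the points that belong to this node.
        for i in range(self.replicas):
            point = _ring_hash(f"{node}#{i}")
            if self._owner.get(point) == node:
                del self._owner[point]
                self._points.remove(point)

    def get_node(self, key):
        # A key belongs to the first point clockwise from its hash, wrapping around.
        if not self._points:
            raise LookupError("ring is empty")
        idx = bisect.bisect_right(self._points, _ring_hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


if __name__ == "__main__":
    ring = ConsistentHashRing(["node-a", "node-b", "node-c"])
    keys = [f"key-{i}" for i in range(1000)]
    before = {k: ring.get_node(k) for k in keys}
    ring.add_node("node-d")
    moved = sum(1 for k in keys if ring.get_node(k) != before[k])
    print(f"{moved} of {len(keys)} keys moved after adding node-d")
```

The property this is meant to show is that adding a node relocates only the keys that the new node takes over, whereas a plain `hash(key) % n` scheme would remap most keys whenever `n` changes.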
Title: 培乐园 - Architecture and Processing of Massive Data, Part 5
Link: https://www.777doc.com/doc-5521206 .html