Big Data Platform Architecture
Data Exploration and Analysis with the Elastic Stack

What's Elastic?
• A distributed startup company, since 2012
  ‒ HQ: Mountain View, CA and Amsterdam, Netherlands
  ‒ Employees in 27 countries (and counting), spread across 18 time zones, speaking over 30 languages
• We work on open source projects!
  ‒ (Luckily some of them are popular, e.g. Elasticsearch)
• Offering support subscriptions, X-Pack, Cloud and trainings
• Find us on: […]

Heard of "ELK"?

But ELK is out! Here I come: Beats & Packetbeat
ELKB? BELK? LKBE? BKEL?

Logo

Release Bonanza

It's time to unite!
Extensions

The "Elastic Stack", staying together from v5.0
• User Interface
• Store, Index, & Analyze
• Ingest

What can the Elastic Stack do?
• GitHub: Enable Powerful Search For Both End-Users And Developers
• […]: Unlocking Interplanetary Datasets with Real-Time Search
• Datadog: analysis of metrics and time-series data
• More: […]

Logstash: Collect from diverse inputs
Inputs: logs, machine data, databases, message queues, social, web APIs, sensors
• Collects from diverse sources
  ‒ Logs + many others
  ‒ Over 200 plugins
• Connects with live streams
  ‒ Real-time data
  ‒ Wire/transaction data
  ‒ Full-packet network capture

Beats
• Beats are lightweight shippers that collect and ship all kinds of operational data to Elasticsearch
  ‒ Small application
  ‒ Installed as an agent on your servers
  ‒ Written in Golang
  ‒ No runtime dependencies
  ‒ Single purpose

Packetbeat: Real-time application monitoring
Sniffs the traffic between your servers and parses the application-level protocols on the fly.
Built-in protocols:
• HTTP
• MySQL
• PostgreSQL
• Redis
• Thrift-RPC
• MongoDB
• DNS
• Memcache
• ICMP
• AMQP
• …
Let's go real time!

Winlogbeat!
Forwards Windows event logs to Elasticsearch.

Filebeat: A more lightweight log shipper
• Generic filtering: flexibly reduce the amount of data sent over the wire and stored

Topbeat
Like the Unix top command, but sends the output periodically to Elasticsearch. Also works on Windows.
• System wide: system load, total CPU usage, …
• Per process: state, name, command line, …
• Disk usage: available disks, used and free space, …

There's more! Metricbeat: Connecting Numb3rs
• Listens to the internal "beat" of systems via APIs.

What's Kibana?
Kibana is an open source analytics and visualization platform designed to work with Elasticsearch.
github.com/elastic/generator-kibana-plugin
• Search & Exploration
• Visualization & Dashboard

Elasticsearch in production
• […]: "107 clusters, ~1747 nodes"
• […] @ Elastic{ON} 16: "~150 clusters totaling ~3,500 nodes hosting ~1.3 PB of data"
Use cases:
• Real-time analytics
• Time series data analytics
• Logging analytics
• Security analytics
• Fraud detection
• Prediction modeling
• Recommendations
• …

Hold on, isn't Elasticsearch a search engine?

You know, for search, and analytics!
• v0.09.0: Facets
• v1.0.0: Aggregation
• v2.0.0: Pipeline Aggregation

Aggregation
• Analytics: histograms, distributions, statistics, geo, …
• Any data: whatever can be queried can be analyzed
• Near real time: computed on demand, with a ~1 s refresh interval
• Nestable and composable: unlike facets, which support only one level

Metrics and buckets
    SELECT COUNT(*), AVG(score)    -- metrics
    FROM `table`
    GROUP BY province, city        -- buckets
• Buckets: Terms, Histogram, Geohash grids, …
• Metrics: min/avg/max, Stats, Cardinality, …

Aggregation == the view from 30,000 feet == patterns
Find some beauty (insights)!

Example: analyzing PM2.5 data
Or like this!

A sample document (city "北京" is Beijing; aq_level "严重污染" means "severely polluted"):
    {
      "city": "北京",
      "date": "2016-02-08",
      "aq_level": "严重污染",
      "aq_rank": 68,
      "aqi": 391,
      "co": 115.5,
      "no2": 1.888,
      "o3": 62.2,
      "pm2_5": 415.7,
      "range": "74~500",
      "so2": 523.5,
      "location": { "lat": 39.92, "lon": 116.46 }
    }
Data source: […]

Air quality statistics (Beijing, full year)
    POST demo/_search?size=0
    {
      "query": { … },
      "aggs": {
        "aq_stats": {
          "terms": { "field": "aq_level", "size": 10 }
        }
      }
    }

Average air quality by city (nested)
    {
      "aggs": {
        "city_stats": {
          "terms": { "field": "city", "size": 10 },
          "aggs": {
            "avg_pm25": { "avg": { "field": "pm2_5" } }
          }
        }
      }
    }

30-day air quality trend (pipeline)
    {
      "aggs": {
        "qa_date_histo": {
          "date_histogram": { "field": "date", "interval": "day" },
          "aggs": {
            "the_avg": { "avg": { "field": "pm2_5" } },
            "the_movavg": {
              "moving_avg": { "buckets_path": "the_avg", "window": 30 }
            }
          }
        }
      }
    }

How aggregations work
• Lucene Collector
• Optimized data structures
  ‒ Compressed columnar data store (previously FieldData, now DocValues)
  ‒ Strings converted to enums (per segment)
• A single pass over your data, along with the query
  ‒ No matter how complex your aggregation is

How aggregations work: distributed execution
• Coordinator
• Shard, Shard, Shard

How aggregations work: inside a shard
• TopHits Collector
• Aggregation Collector
• Lucene Index / an ES shard: Segment, Segment
• Search: inverted index
• Aggregation: DocValues
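The Coordinator/Shard slides and the "single pass" bullet can be illustrated with a small sketch in plain Python. This is not Elasticsearch code: the shard contents, the PM2.5 values, and the helper names `shard_collect`, `coordinator_reduce` and `run_terms_avg` are all invented for illustration. The idea it shows is that each shard makes one pass over its own documents and returns partial state per bucket (a count and a sum), and the coordinator merges those partials before computing the final average.

```python
from collections import defaultdict

# Hypothetical documents spread over three shards (values are invented).
SHARDS = [
    [{"city": "Beijing", "pm2_5": 400.0}, {"city": "Shanghai", "pm2_5": 60.0}],
    [{"city": "Beijing", "pm2_5": 200.0}, {"city": "Guangzhou", "pm2_5": 40.0}],
    [{"city": "Shanghai", "pm2_5": 80.0}],
]


def shard_collect(docs, bucket_field, metric_field):
    """One pass over a shard: keep [count, sum] per bucket key."""
    partial = defaultdict(lambda: [0, 0.0])
    for doc in docs:
        state = partial[doc[bucket_field]]
        state[0] += 1
        state[1] += doc[metric_field]
    return dict(partial)


def coordinator_reduce(partials):
    """Merge the shard partials, then finalize doc_count and avg per bucket."""
    merged = defaultdict(lambda: [0, 0.0])
    for partial in partials:
        for key, (count, total) in partial.items():
            merged[key][0] += count
            merged[key][1] += total
    return {
        key: {"doc_count": count, "avg": total / count}
        for key, (count, total) in merged.items()
    }


def run_terms_avg(shards, bucket_field="city", metric_field="pm2_5"):
    """Scatter to every shard, gather the partials, reduce on the coordinator."""
    return coordinator_reduce(
        shard_collect(docs, bucket_field, metric_field) for docs in shards
    )


if __name__ == "__main__":
    for city, stats in run_terms_avg(SHARDS).items():
        print(city, stats)
```

Shards return a count and a sum rather than a ready-made average because averages of averages would be wrong when shards hold different numbers of documents per bucket. A real terms aggregation additionally trims each shard's result to a top-N list, which this sketch leaves out.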
Inverted index (example)
Term: 北京, 上海, 广州, Beijing
DocID: 1, 5, …
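The contrast between the two structures (inverted index for search, DocValues for aggregation) may be easier to see in a toy sketch. Everything here is an assumption made for illustration: the four documents are invented, and `build_inverted_index`, `build_doc_values` and `search_then_aggregate` are hypothetical helper names. Real Lucene structures are compressed and built per segment, which this ignores.

```python
from collections import Counter, defaultdict

# Invented documents: doc ID -> field values.
DOCS = {
    1: {"city": "Beijing", "aq_level": "severe"},
    2: {"city": "Shanghai", "aq_level": "good"},
    3: {"city": "Guangzhou", "aq_level": "good"},
    5: {"city": "Beijing", "aq_level": "moderate"},
}


def build_inverted_index(docs, field):
    """Term -> sorted list of doc IDs: answers 'which docs contain this term?'."""
    index = defaultdict(list)
    for doc_id in sorted(docs):
        index[docs[doc_id][field]].append(doc_id)
    return dict(index)


def build_doc_values(docs, field):
    """Doc ID -> value, one column per field: answers 'what value does this doc have?'."""
    return {doc_id: docs[doc_id][field] for doc_id in sorted(docs)}


def search_then_aggregate(docs, query_field, query_term, agg_field):
    """Find matching docs via the inverted index, then bucket them via the column."""
    matching = build_inverted_index(docs, query_field).get(query_term, [])
    column = build_doc_values(docs, agg_field)
    return Counter(column[doc_id] for doc_id in matching)


if __name__ == "__main__":
    print(build_inverted_index(DOCS, "city"))
    print(search_then_aggregate(DOCS, "city", "Beijing", "aq_level"))
```

The search step goes from a term to documents, while the aggregation step goes from a document to its value, so each direction gets its own structure instead of inverting one of them at query time.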