Search Engine Techniques and Trends

Xiaoming Li and Yan Zhang
School of Electronics Engineering and Computer Science, Peking University

Abstract: Search engines, by virtue of their powerful and convenient access to information, are reaching into almost all aspects of society. However, ever higher demands are placed on search engines by the growing Web, increasing user expectations, and the changing network infrastructure. Meanwhile, some non-technical factors also push search engines to go deeper. This report analyzes the various challenges faced by search engines and summarizes the main points of the solutions. In addition, a list of related research groups is presented, followed by an outlook on the future of search engines.

2006719 [1] 66.3% 11994 AltaVista Yahoo Infoseek Steve Lawrence C. Lee Giles 19992 [3] 1116% 11998 Google Lycos Google [4] [23] Web 1 Web Web 2.0 [16] Deep Web [7] 2 Web 1.0 Web 2.0 Web 2.0 Web 2.0 IM Instant Messenger P2P Peer to Peer Blog Weblog RSS RDF Site Summary WIKI WB Web Bookmarks Web SNS Social Network Software RSS RSS 20019120049195150 Blog CCW Research 2006680051.1%210040.0% Web Alexandros Ntoulas [56] Web 8% 154 popular Blog Deep Web Deep Web Invisible Web Hidden Web Jill Ellsworth 1994 Invisible Web Chris Sherman Gary Price The Invisible Web Invisible Web Invisible Web the Opaque Web the Private Web the Proprietary Web the Truly Invisible Web deep invisible Web BrightPlanet [7] 2001 Deep Web Surface Web 500 Chris Sherman Gary Price Invisible Web Surface Web 250 Deep Web Deep Web 3 Deep Web Deep Web Web 1020502 Web Alexandros Ntoulas [5] 25% 24% Google 25% Ranking Real Web timeliness freshness 3 Web AltaVista 5.55.41100/ Google 4.44 SEO SEO SEO Search Engine Optimization Search Engine Optimizer SEO Web 4 SEO US Census Bureau 20046927.8% GDP Forrester Research B2C 2010329013% Web SEO SEO Web Crawler Link Farm [910] Web Spammer Boosting Hiding Boosting Hiding Boosting Web Crawler Boosting Term Spamming Link Spamming Term Spamming Spammer Web Body Title Anchor Text Spammer Term Spamming Link Spamming Web Spammer Honey Pot Link Farm Link Farm Link Farm [9] Link Spamming Hiding Content Hiding Cloaking Redirection Content Hiding Web Crawler Cloaking Web Crawler Redirection Cloaking Crawler Hiding Spammer 2006 bmw.com.de Google 51 Ranking 2361 PDA Gmail Nokia Yahoo 200697 Nokia “Nokia Mobile Search” Nokia Nseries S60 Yahoo 102 pdf IPv6 33G Morgan Stanley 20064106 Global Internet Trends [8] PC PC 7 Google 1 Web Google Google mp3 2 Google 3=20063 Google 9000 Google Google ? 81123 452 vertical searching Crawler Ontology XML Citeseer Google Earth Yahoo Shopping Shopping.com CiteSeer [11] Microsoft 6 URL Host Name Web web [10] Stanford Gyongyi Haveliwala Topic-Sensitive PageRank TrustRank web [12] web spam [13] link spam 10 [14] Gyongyi TrustRank András A. Benczúr Lehigh University Baoning Wu Brian D. Davison Google Yahoo 2 Silverstein 85% 10 [15] Google News vivisimo Google 2001 HillTop HillTop Google MIBIRS 10001 Google Yahoo AltaVista Lycos AllTheWeb 111862004 Lowe SIFT (Scale-Invariant Feature Transform) [17] SIFT 2880 Google Yahoo • Jacques Chirac 2006 Quaero Google Yahoo Quaero Quaero LTU Technologies FBI 12 Quaero Quaero Quaero 5102052.52 Deep Web Deep Web Stanford UIUC HiWE [18] MetaQuerier [19] Deep Web [22] Web InfoMall!Q [20] Yahoo semantic network Google Personalized Search [21] Google 2 Google Desktop Microsoft Office 3 ubiquitous Google 1% Microsoft 14 MotionBridge Yahoo PC Google Yahoo Quaero 200520200735 Google Google Google SearchMash.com Google SearchMash Google MIBIRS Google 151000 Google Microsoft ://research.microsoft.com/asia/ 2001 Web 20063811 IPTV 20069 Yahoo! ://infolab.stanford.edu/ Google WebBase Digital Libraries Junghoo Cho Kevin Chang UCLA UIUC CMU LTI, ://ciir.cs.umass.edu/ 16 CIIR IR IR IF IE CERNET Web :// ://159.226.40.18/ [2] TREC * ://210.25.191.143/ CERNET CERNET2 IPv6 2003200420073531 * 863-306172004 CWT100g 15002006 CWT200g 3300020022003 Web 20052005 Yahoo Social Search Yahoo Yahoo Flickr del.icio.us Microsoft Microsoft MSN Google Larry Page Google 99% 99% 18 Sergey Brin IT …… 19 IT B2B B2C Google Intel 863 … Google Goo… Quaero 201..200672,..2005.,20068,pp.91-109

[3] Steve Lawrence and C. Lee Giles. “Accessibility of information on the web”, Nature, Vol. 400, July 1999.
[4] L. Page, S. Brin, R. Motwani, and T. Winograd. “The PageRank Citation Ranking: Bringing Order to the Web”, Technical Report, Stanford University, Stanford, CA, 1998.
[5] A. Ntoulas, J. Cho, H. Kyu Cho, H. Cho and Y. J. Cho. A study on the evolution of the Web. In Proceedings of the 2005 UKC Conference, August 2005.
[6] A. Ntoulas, J. Cho and C. Olston. What's New on the Web? The Evolution of the Web from a Search Engine Perspective. In Proceedings of the 2004