Breaking News Detection and Tracking in Twitter

Swit Phuvipadawat, Tsuyoshi Murata
Department of Computer Science, Graduate School of Information Science and Engineering
Tokyo Institute of Technology, Japan
swit.p@ai.cs.titech.ac.jp, murata@cs.titech.ac.jp

2010 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology. 978-0-7695-4191-4/10 $26.00 © 2010 IEEE. DOI 10.1109/WI-IAT.2010.205

Abstract: Twitter has been used as one of the communication channels for spreading breaking news. We propose a method to collect, group, rank and track breaking news in Twitter. Since short messages make similarity comparison difficult, we boost scores on proper nouns to improve the grouping results. Each group is ranked based on popularity and reliability factors. The current detection method is limited to the factual part of messages. We developed an application called “Hotstream” based on the proposed method. Users can discover breaking news from the Twitter timeline. Each story is presented with information on the message originator, the story's development, and an activity chart. This provides a convenient way for people to follow breaking news and stay informed with real-time updates.

Keywords: Twitter, Topic Detection and Tracking, Real-time text-mining, Information Retrieval

I. INTRODUCTION

Twitter is a social networking service that allows users to share information, described by Twitter as “What’s happening?”, in the form of short texts (140 characters). The main characteristics of Twitter are brevity (contents are short) and simultaneousness (contents are updated frequently). Twitter has transformed the way people convey information, especially in the area of news. In June 2009, Twitter played an important role in delivering user-generated content from Iranian citizens during the Iran election. People with technology played the role of journalists in a situation where conventional news reporting had been made difficult [1]. Anyone, even those not associated with the media industry, can deliver news. Thus, Twitter presents a highly effective way to discover what is happening around the world.

Breaking news is defined by Wiktionary [2] as “news that has either just happened or is currently happening. Breaking news may contain incomplete information, factual error or poor editing because of rush.” Under this definition, Twitter fits the needs of breaking news delivery. However, discovering news posted in Twitter requires effort. Firstly, users often have difficulty deciding which users to follow, that is, finding users with interesting tweets [3]. Secondly, users need to read through status updates and follow links to obtain further information. To ease these problems and to deliver breaking news effectively, we propose a method to collect, group, rank and track breaking news in Twitter. This work is a contribution to the area of Topic Detection and Tracking (TDT) [4]. The tasks we focus on are first story detection, cluster detection, and tracking.

II. CHARACTERISTICS OF BREAKING NEWS IN TWITTER

As a preparatory experiment for analyzing the characteristics of breaking news, we collected messages from Twitter using the Twitter API. The data contains 121,000 messages from public statuses and 33,000 messages from a selected group of 250 users who contribute to breaking news postings in Twitter. We selected users who use a breaking news hashtag (#breakingnews) in their messages.

TABLE I
CHARACTERISTICS OF MESSAGES IN TWITTER BASED ON 154,000 MESSAGE SAMPLES

Characteristic    No. of occurrences    Percentage
Tag a user        79,469                51.6%
Embed a link      50,404                32.7%
Retweet           29,935                19.4%
Use a hashtag     20,348                13.2%

Table I shows the characteristics of messages and the number of occurrences of each. For breaking news detection, these characteristics help us find more facts about a message. From user tags, we can identify conversations between users. From embedded links, we can follow them to find more information. To retweet means to repost another user’s message; from the number of retweets, we can determine the popularity or importance of a message. From hashtags, we can group related messages together. A retweeted message often contains information about the message originator and the previous message.

There are two aspects to consider when detecting breaking news in Twitter: the single message aspect and the timeline aspect. The two aspects are described in detail as follows.

A. Single message aspect

There are two important elements in a message: emotions and facts. The inclusion of emotions in the message makes news delivered in Twitter different from news delivered by professional journalists. Although there are cases where emotions are conveyed in conventional news, expression of emotions occurs much more often in Twitter messages. Emotions are expressed through the use of symbols (mainly the exclamation mark ‘!’) and the use of sensational adjectives and phrases: crazy, amazing, great, terrible, wonderful, shocking, oh my god, etc.

Figure 1. Emotions and facts in a message

Facts are provided in text-based and hypertext-based forms, and through location and source information of the message originator. Text-based information is highly significant as it helps interrogate the details of the news in terms of ‘what’, ‘where’, ‘when’, ‘how’, etc. We can identify keywords from facts that contribute to a news story. These keywords are identified as significant nouns and verbs. Significant nouns include keywords found in conventional news and names of famous places, people and events, such as Japan, US president, emergency and airplane. Significant verbs are, for example, fire, crash, bomb, survive, rescue, win, etc. Users often tag their message with a hash symbol (#) followed by keywords, for example #breakingnews, #haiti, etc., as a means to group together messages related to the keywords. Hy
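The message characteristics of Table I and the emotion and fact cues of Section II-A can be illustrated with a small sketch. This is not the authors' implementation: the regular expressions, the "RT @user" retweet convention, and the function name message_features are assumptions made for illustration, and the word lists contain only the examples quoted in the text.

```python
import re

# Assumed patterns: the paper does not say how these features are detected.
MENTION_RE = re.compile(r"@\w+")            # "tag a user"
LINK_RE = re.compile(r"https?://\S+")       # "embed a link"
HASHTAG_RE = re.compile(r"#\w+")            # "use a hashtag"
RETWEET_RE = re.compile(r"\bRT\s+@\w+")     # assumed "RT @user" convention

# Only the example words quoted in Section II-A; real lists would be larger.
EMOTION_WORDS = {"crazy", "amazing", "great", "terrible", "wonderful", "shocking"}
SIGNIFICANT_VERBS = {"fire", "crash", "bomb", "survive", "rescue", "win"}


def message_features(text):
    """Return Table I style characteristics plus emotion and fact cues."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    emotion_cues = sorted(words & EMOTION_WORDS)
    if "!" in text:
        emotion_cues.append("!")  # exclamation mark counts as an emotion symbol
    return {
        "tags_user": bool(MENTION_RE.search(text)),
        "has_link": bool(LINK_RE.search(text)),
        "is_retweet": bool(RETWEET_RE.search(text)),
        "hashtags": [h.lower() for h in HASHTAG_RE.findall(text)],
        "emotion_cues": emotion_cues,
        "fact_verbs": sorted(words & SIGNIFICANT_VERBS),
    }


if __name__ == "__main__":
    sample = "RT @alice: Shocking! Plane crash near Tokyo http://example.com/x #breakingnews"
    print(message_features(sample))
```

Exact word matching is used here for brevity; it would miss inflected forms such as "crashed", so a practical version would likely need stemming or part-of-speech tagging to find significant nouns and verbs.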