LEIBNIZ CENTER FOR RESEARCH IN COMPUTER SCIENCE TECHNICAL REPORT 2003-43

Computing Gaussian Mixture Models with EM using Equivalence Constraints

Noam Shental, Aharon Bar-Hillel, Tomer Hertz and Daphna Weinshall
School of Computer Science and Engineering and the Center for Neural Computation
The Hebrew University of Jerusalem, Jerusalem, Israel 91904
Email: {fenoam, aharonbh, tomboy, daphna}@cs.huji.ac.il

Abstract

Gaussian mixture models for density estimation are usually estimated in an unsupervised manner, using an Expectation Maximization (EM) procedure. In this paper we show how equivalence constraints can be incorporated into this procedure, leading to improved model estimation and improved clustering results. Equivalence constraints provide additional information on pairs of data points, indicating whether the points arise from the same source (positive constraint) or from different sources (negative constraint). Such constraints can be gathered automatically in some learning problems, and are a natural form of supervision in others. We present a closed form EM procedure for handling positive constraints, and a Generalized EM procedure using a Markov network for the incorporation of negative constraints. Using publicly available data sets, we demonstrate that incorporating equivalence constraints leads to considerable improvement in clustering performance, and that our algorithm outperforms all available competitors.

Keywords: semi-supervised learning, equivalence constraints, clustering, EM, Gaussian mixture models

1 Introduction

Gaussian Mixture Models (GMM) for density estimation are popular for two main reasons: they can be reliably computed by the efficient Expectation Maximization (EM) algorithm [1], and they provide a generative model for the way the data may have been created. The second property in particular accounts for their common use in unsupervised clustering, where typically the Gaussian components of the GMM are taken to represent different sources. This use is common because most other clustering algorithms are not generative, and therefore cannot provide predictions regarding previously unseen points.¹

When used for clustering in this way, the underlying assumption - i.e., that the density is comprised of a mixture of different Gaussian sources - is often hard to justify. It is therefore important to have additional information which can steer the GMM estimation in the "right" direction. For example, we may have access to the labels of part of the data set. The estimation problem then belongs to the semi-supervised learning domain, since the estimation relies on both labeled and unlabeled points.

In this paper we focus on another type of side-information, in which equivalence constraints between a few of the data points are provided. More specifically, we use an unlabeled data set augmented by equivalence constraints between pairs of data points, where the constraints determine whether each pair was generated by the same source or by different sources. We denote the former case as 'positive' constraints and the latter case as 'negative' constraints, and present a method to incorporate them into an EM procedure.
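Although the paper develops the procedure in detail later, the positive-constraint case can be sketched informally: if positive constraints are transitively closed into small groups of points ("chunklets") that must share a single hidden source, then a chunklet's E-step posterior is driven by the product of its points' likelihoods. The following is a minimal illustrative sketch of such an E-step, assuming points are sampled i.i.d. given the chunklet's source; it is not the paper's complete algorithm (the M-step and the handling of negative constraints are omitted).

```python
import numpy as np
from scipy.stats import multivariate_normal

def chunklet_e_step(chunklets, weights, means, covs):
    """Posterior over the M Gaussian sources for each chunklet.

    chunklets : list of (n_j, d) arrays; the points of chunklet j are
                constrained to come from the same (unknown) source.
    weights, means, covs : current mixture parameters for M sources.

    Because all points of a chunklet share one hidden label, the
    likelihood of chunklet j under source l is the product of its
    points' Gaussian densities, weighted by the mixing coefficient.
    """
    M = len(weights)
    log_post = np.empty((len(chunklets), M))
    for j, X in enumerate(chunklets):
        for l in range(M):
            log_post[j, l] = np.log(weights[l]) + \
                multivariate_normal.logpdf(X, means[l], covs[l]).sum()
    # Normalize in log space for numerical stability.
    log_post -= log_post.max(axis=1, keepdims=True)
    post = np.exp(log_post)
    return post / post.sum(axis=1, keepdims=True)
```

An unconstrained point is simply a chunklet of size one, for which this reduces to the standard EM E-step; larger chunklets concentrate their posterior mass on sources that can explain all of their points simultaneously.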
What do we expect to gain from the semi-supervised approach to GMM estimation? We may hope that introducing side-information into the EM algorithm will result in faster convergence to a solution of higher likelihood. But much more importantly, our equivalence constraints should change the GMM likelihood function. As a result, the estimation procedure may choose different solutions, which would have otherwise been rejected due to their low relative likelihood in the unconstrained GMM density model. Ideally the solution obtained with side information will be more faithful to the desired results. A simple example demonstrating this point is shown in Fig. 1.

[Figure 1 appears here: panels (a) and (b), each showing an unconstrained (left) and a constrained (right) clustering result.]

Figure 1: Illustrative examples to demonstrate the added value of equivalence constraints. (a) The data set consists of two vertically aligned classes: left - given no additional information, the EM algorithm identifies two horizontal classes, and this can be shown to be the maximum likelihood solution (with log likelihood of −3500 vs. log likelihood of −2800 for the solution shown on the right); right - additional side information in the form of equivalence constraints changes the probability function and we get a vertical partition as the most likely solution. (b) The data set consists of two classes with partial overlap: left - without constraints the most likely solution includes two non-overlapping sources; right - with constraints the correct model with overlapping classes was retrieved as the most likely solution. In all plots only the class assignment of novel unconstrained points is shown.

Why do we use equivalence constraints, rather than partial labels as in prior work (summarized below)? Our basic observation is that unlike labels, in many unsupervised learning tasks equivalence constraints may be extracted with minimal effort or even automatically. One example is when the data is inherently sequential and can be modelled by a Markovian process. Consider for example a security camera application, where the objective is to find all the frames in which the same intruder appears. Due to the continuous nature of the data, intruders extracted from successive frames in the same clip can be assumed to come from the same person, thus forming positive constraints. In addition, two intruders which appear simultaneously in front of two cameras cannot be the same person, hence a negative constraint is automatically established. Another analogous example is speaker segmentation and recognition, in which the conversation between several speakers needs to be segmented and clustered according to speaker identity. Here, it may likewise be possible to gather equivalence constraints automatically, by exploiting the temporal continuity of the speech signal.
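To make the surveillance example concrete, here is a purely illustrative sketch of automatic constraint extraction. The detection records and their fields ('id', 'clip', 'frame', 'camera') are hypothetical names chosen for this example, not data structures from the paper; the two rules implement the observations above: temporal continuity within a clip yields positive pairs, and simultaneous appearance on different cameras yields negative pairs.

```python
from itertools import combinations

def constraints_from_detections(detections):
    """Derive equivalence constraints from surveillance detections.

    `detections` is a hypothetical list of dicts with keys 'id',
    'clip', 'frame' and 'camera'. We assume frame indices are
    synchronized timestamps across cameras, and at most one
    detection per frame within a clip.
    """
    positives, negatives = [], []

    # Positive constraints: detections in successive frames of the
    # same clip are assumed to show the same person.
    by_clip = {}
    for d in detections:
        by_clip.setdefault(d['clip'], []).append(d)
    for clip_dets in by_clip.values():
        clip_dets.sort(key=lambda d: d['frame'])
        for a, b in zip(clip_dets, clip_dets[1:]):
            if b['frame'] - a['frame'] == 1:
                positives.append((a['id'], b['id']))

    # Negative constraints: detections at the same time on different
    # cameras cannot be the same person.
    by_frame = {}
    for d in detections:
        by_frame.setdefault(d['frame'], []).append(d)
    for frame_dets in by_frame.values():
        for a, b in combinations(frame_dets, 2):
            if a['camera'] != b['camera']:
                negatives.append((a['id'], b['id']))

    return positives, negatives
```

The resulting positive pairs could then be transitively closed into chunklets and fed to a constrained E-step such as the sketch above, while the negative pairs would enter the separate mechanism the paper proposes for negative constraints.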