Bregman-Divergences

MatrixNearnessProblemsusingBregmanDivergencesInderjitS.DhillonTheUniversityofTexasatAustinHouseholderSymposiumXVISevenSprings,PAMay24,2005jointworkwithA.Banerjee,H.Cho,J.Ghosh,Y.Guan,S.Merugu,D.Modha,S.SraandJ.TroppBregmanDivergencesLetϕ:S→Rbeadifferentiable,strictlyconvexfunctionof“Legendretype”(S⊆Rd)TheBregmanDivergenceDϕ:S×int(S)→RisdeﬁnedasDϕ(x,y)=ϕ(x)−ϕ(y)−(x−y)T∇ϕ(y)BregmanDivergencesLetϕ:S→Rbeadifferentiable,strictlyconvexfunctionof“Legendretype”(S⊆Rd)TheBregmanDivergenceDϕ:S×int(S)→RisdeﬁnedasDϕ(x,y)=ϕ(x)−ϕ(y)−(x−y)T∇ϕ(y)yxϕ(z)=12zTzh(z)Dϕ(x,y)=12kx−yk2SquaredEuclideandistanceisaBregmandivergenceBregmanDivergencesLetϕ:S→Rbeadifferentiable,strictlyconvexfunctionof“Legendretype”(S⊆Rd)TheBregmanDivergenceDϕ:S×int(S)→RisdeﬁnedasDϕ(x,y)=ϕ(x)−ϕ(y)−(x−y)T∇ϕ(y)yxDϕ(x,y)=xlogxy−x+yh(z)ϕ(z)=zlogzRelativeEntropy(alsocalledKL-divergence)isanotherBregmandivergenceBregmanDivergencesLetϕ:S→Rbeadifferentiable,strictlyconvexfunctionof“Legendretype”(S⊆Rd)TheBregmanDivergenceDϕ:S×int(S)→RisdeﬁnedasDϕ(x,y)=ϕ(x)−ϕ(y)−(x−y)T∇ϕ(y)yxDϕ(x,y)=xy−logxy−1h(z)ϕ(z)=−logzItakura-SaitoDistance(usedinsignalprocessing)isanotherBregmandivergenceBregmanDivergencesFunctionNameϕ(x)domϕDϕ(x,y)Squarednorm12x2(−∞,+∞)12(x−y)2Shannonentropyxlogx−x[0,+∞)xlogxy−x+yBitentropyxlogx+(1−x)log(1−x)[0,1]xlogxy+(1−x)log1−x1−yBurgentropy−logx(0,+∞)xy−logxy−1Hellinger−√1−x2[−1,1](1−xy)(1−y2)−1/2−(1−x2)1/2`pquasi-norm−xp(0p1)[0,+∞)−xp+pxyp−1−(p−1)yp`pnorm|x|p(1p∞)(−∞,+∞)|x|p−pxsgny|y|p−1+(p−1)|y|pExponentialex(−∞,+∞)ex−(x−y+1)eyInverse1/x(0,+∞)1/x+x/y2−2/yPropertiesofBregmanDivergencesDϕ(x,y)≥0,andequals0iffx=yPropertiesofBregmanDivergencesDϕ(x,y)≥0,andequals0iffx=yNotametric(symmetry,triangleinequalitydonothold)PropertiesofBregmanDivergencesDϕ(x,y)≥0,andequals0iffx=yNotametric(symmetry,triangleinequalitydonothold)Strictlyconvexintheﬁrstargument,butnotconvex(ingeneral)inthesecondargumentPropertiesofBregmanDivergencesDϕ(x,y)≥0,andequals0iffx=yNotametric(symmetry,triangleinequalitydonothold)Strictlyconvexintheﬁrstargument,butnotconvex(ingeneral)inthesecondargumentThree-pointpropertygeneralizesthe“Lawofcosines”:Dϕ(x,y)=Dϕ(x,z)+Dϕ(z,y)−(x−z)T(∇ϕ(y)−∇ϕ(z))BregmanProjectionsNearnessinBregmandivergence:the“Bregman”projectionofyontoaconvexsetΩ,PΩ(y)=argminω∈ΩDϕ(ω,y)BregmanProjectionsNearnessinBregmandivergence:the“Bregman”projectionofyontoaconvexsetΩ,PΩ(y)=argminω∈ΩDϕ(ω,y)yPΩ(y)ΩBregmanProjectionsNearnessinBregmandivergence:the“Bregman”projectionofyontoaconvexsetΩ,PΩ(y)=argminω∈ΩDϕ(ω,y)yxPΩ(y)ΩBregmanProjectionsNearnessinBregmandivergence:the“Bregman”projectionofyontoaconvexsetΩ,PΩ(y)=argminω∈ΩDϕ(ω,y)yxPΩ(y)ΩGeneralizedPythagorasTheorem:Dϕ(x,y)≥Dϕ(x,PΩ(y))+Dϕ(PΩ(y),y)WhenΩisanafﬁneset,theaboveholdswithequalityHistoricalReferencesL.M.Bregman.“Therelaxationmethodofﬁndingthecommonpointofconvexsetsanditsapplicationtothesolutionofproblemsinconvexprogramming.”USSRComputationalMathematicsandPhysics,7:200-217,1967.Problem:minϕ(x)subjecttoaTix=bi,i=0,...,m−1Bregman’scyclicprojectionmethod:1.Startwithappropriatex(0).Computex(t+1)tobetheBregmanprojectionofx(t)ontothei-thhyperplane(i=tmodm)fort=0,1,2,...Convergestogloballyoptimalsolution.Thiscyclicprojectionmethodcanbeextendedtohalfspaceandconvexconstraints,whereeachprojectionisfollowedbyacorrection.Question:WhatrolecanBregmanDivergencesplayindataanalysis?ExponentialFamiliesofDistributionsDeﬁnition.AregularexponentialfamilyisafamilyofprobabilitydistributionsonRdwithdensityfunctionparameterizedbyθ:pψ(x|θ)=exp{xTθ−ψ(θ)−gψ(x)}ψistheso-calledcumulantfunction,andisaconvexfunctionofLegendretypeExponentialFamiliesofDistributionsDeﬁnition.AregularexponentialfamilyisafamilyofprobabilitydistributionsonRdwithdensityfunctionparameterizedbyθ:pψ(x|θ)=exp{xTθ−ψ(θ)−gψ(x)}ψistheso-calledcumulantfunction,andisaconvexfunctionofLegendretypeExample—considersphericalGaussiansparameterizedbymeanμ(withﬁxedvarianceσ):p(x)=1(2πσ2)dexp−12σ2kx−μk2=1(2πσ2)dexpxTμσ2−σ22μσ22−xTx2σ2Thusθ=μσ2,andψ(θ)=σ22θ2ExponentialFamiliesofDistributionsDeﬁnition.AregularexponentialfamilyisafamilyofprobabilitydistributionsonRdwithdensityfunctionparameterizedbyθ:pψ(x|θ)=exp{xTθ−ψ(θ)−gψ(x)}ψistheso-calledcumulantfunction,andisaconvexfunctionofLegendretypeExample—considersphericalGaussiansparameterizedbymeanμ(withﬁxedvarianceσ):p(x)=1(2πσ2)dexp−12σ2kx−μk2=1(2πσ2)dexpxTμσ2−σ22μσ22−xTx2σ2Thusθ=μσ2,andψ(θ)=σ22θ2Note:Gaussiandistribution←→SquaredLossExamplePoissonDistribution:p(x)=λxx!e−λ,x∈Z+ThePoissonDistributionisamemberoftheexponentialfamilyIsthereaDivergenceassociatedwiththePoissonDistribution?ExamplePoissonDistribution:p(x)=λxx!e−λ,x∈Z+ThePoissonDistributionisamemberoftheexponentialfamilyIsthereaDivergenceassociatedwiththePoissonDistribution?YES—p(x)canbewrittenasp(x)=exp{−Dϕ(x,μ)−gϕ(x)},whereDϕistheRelativeEntropy,i.e.,Dϕ(x,μ)=xlogxμ−x+μExamplePoissonDistribution:p(x)=λxx!e−λ,x∈Z+ThePoissonDistributionisamemberoftheexponentialfamilyIsthereaDivergenceassociatedwiththePoissonDistribution?YES—p(x)canbewrittenasp(x)=exp{−Dϕ(x,μ)−gϕ(x)},whereDϕistheRela

Bregman-Divergences

免费阅读已结束，点击付费阅读剩下 ... 页

阅读已结束，您可以下载文档离线阅读

25T型客车电气原理培训

国有控股保险公司的人力资本投资问题研究

银行金融知识

SPC教育训练

社会发展过程及其动力

草原征占用审核审批管理办法(doc5)

公司企业年会策划筹备手册（指导性强）（DOC8页）

第七章运营资金管理

行政总监岗位说明书

公文写作培训课件(新标)

相关文档

相关搜索