您好,欢迎访问三七文档
当前位置:首页 > 行业资料 > 能源与动力工程 > Bregman-Divergences
MatrixNearnessProblemsusingBregmanDivergencesInderjitS.DhillonTheUniversityofTexasatAustinHouseholderSymposiumXVISevenSprings,PAMay24,2005jointworkwithA.Banerjee,H.Cho,J.Ghosh,Y.Guan,S.Merugu,D.Modha,S.SraandJ.TroppBregmanDivergencesLetϕ:S→Rbeadifferentiable,strictlyconvexfunctionof“Legendretype”(S⊆Rd)TheBregmanDivergenceDϕ:S×int(S)→RisdefinedasDϕ(x,y)=ϕ(x)−ϕ(y)−(x−y)T∇ϕ(y)BregmanDivergencesLetϕ:S→Rbeadifferentiable,strictlyconvexfunctionof“Legendretype”(S⊆Rd)TheBregmanDivergenceDϕ:S×int(S)→RisdefinedasDϕ(x,y)=ϕ(x)−ϕ(y)−(x−y)T∇ϕ(y)yxϕ(z)=12zTzh(z)Dϕ(x,y)=12kx−yk2SquaredEuclideandistanceisaBregmandivergenceBregmanDivergencesLetϕ:S→Rbeadifferentiable,strictlyconvexfunctionof“Legendretype”(S⊆Rd)TheBregmanDivergenceDϕ:S×int(S)→RisdefinedasDϕ(x,y)=ϕ(x)−ϕ(y)−(x−y)T∇ϕ(y)yxDϕ(x,y)=xlogxy−x+yh(z)ϕ(z)=zlogzRelativeEntropy(alsocalledKL-divergence)isanotherBregmandivergenceBregmanDivergencesLetϕ:S→Rbeadifferentiable,strictlyconvexfunctionof“Legendretype”(S⊆Rd)TheBregmanDivergenceDϕ:S×int(S)→RisdefinedasDϕ(x,y)=ϕ(x)−ϕ(y)−(x−y)T∇ϕ(y)yxDϕ(x,y)=xy−logxy−1h(z)ϕ(z)=−logzItakura-SaitoDistance(usedinsignalprocessing)isanotherBregmandivergenceBregmanDivergencesFunctionNameϕ(x)domϕDϕ(x,y)Squarednorm12x2(−∞,+∞)12(x−y)2Shannonentropyxlogx−x[0,+∞)xlogxy−x+yBitentropyxlogx+(1−x)log(1−x)[0,1]xlogxy+(1−x)log1−x1−yBurgentropy−logx(0,+∞)xy−logxy−1Hellinger−√1−x2[−1,1](1−xy)(1−y2)−1/2−(1−x2)1/2`pquasi-norm−xp(0p1)[0,+∞)−xp+pxyp−1−(p−1)yp`pnorm|x|p(1p∞)(−∞,+∞)|x|p−pxsgny|y|p−1+(p−1)|y|pExponentialex(−∞,+∞)ex−(x−y+1)eyInverse1/x(0,+∞)1/x+x/y2−2/yPropertiesofBregmanDivergencesDϕ(x,y)≥0,andequals0iffx=yPropertiesofBregmanDivergencesDϕ(x,y)≥0,andequals0iffx=yNotametric(symmetry,triangleinequalitydonothold)PropertiesofBregmanDivergencesDϕ(x,y)≥0,andequals0iffx=yNotametric(symmetry,triangleinequalitydonothold)Strictlyconvexinthefirstargument,butnotconvex(ingeneral)inthesecondargumentPropertiesofBregmanDivergencesDϕ(x,y)≥0,andequals0iffx=yNotametric(symmetry,triangleinequalitydonothold)Strictlyconvexinthefirstargument,butnotconvex(ingeneral)inthesecondargumentThree-pointpropertygeneralizesthe“Lawofcosines”:Dϕ(x,y)=Dϕ(x,z)+Dϕ(z,y)−(x−z)T(∇ϕ(y)−∇ϕ(z))BregmanProjectionsNearnessinBregmandivergence:the“Bregman”projectionofyontoaconvexsetΩ,PΩ(y)=argminω∈ΩDϕ(ω,y)BregmanProjectionsNearnessinBregmandivergence:the“Bregman”projectionofyontoaconvexsetΩ,PΩ(y)=argminω∈ΩDϕ(ω,y)yPΩ(y)ΩBregmanProjectionsNearnessinBregmandivergence:the“Bregman”projectionofyontoaconvexsetΩ,PΩ(y)=argminω∈ΩDϕ(ω,y)yxPΩ(y)ΩBregmanProjectionsNearnessinBregmandivergence:the“Bregman”projectionofyontoaconvexsetΩ,PΩ(y)=argminω∈ΩDϕ(ω,y)yxPΩ(y)ΩGeneralizedPythagorasTheorem:Dϕ(x,y)≥Dϕ(x,PΩ(y))+Dϕ(PΩ(y),y)WhenΩisanaffineset,theaboveholdswithequalityHistoricalReferencesL.M.Bregman.“Therelaxationmethodoffindingthecommonpointofconvexsetsanditsapplicationtothesolutionofproblemsinconvexprogramming.”USSRComputationalMathematicsandPhysics,7:200-217,1967.Problem:minϕ(x)subjecttoaTix=bi,i=0,...,m−1Bregman’scyclicprojectionmethod:1.Startwithappropriatex(0).Computex(t+1)tobetheBregmanprojectionofx(t)ontothei-thhyperplane(i=tmodm)fort=0,1,2,...Convergestogloballyoptimalsolution.Thiscyclicprojectionmethodcanbeextendedtohalfspaceandconvexconstraints,whereeachprojectionisfollowedbyacorrection.Question:WhatrolecanBregmanDivergencesplayindataanalysis?ExponentialFamiliesofDistributionsDefinition.AregularexponentialfamilyisafamilyofprobabilitydistributionsonRdwithdensityfunctionparameterizedbyθ:pψ(x|θ)=exp{xTθ−ψ(θ)−gψ(x)}ψistheso-calledcumulantfunction,andisaconvexfunctionofLegendretypeExponentialFamiliesofDistributionsDefinition.AregularexponentialfamilyisafamilyofprobabilitydistributionsonRdwithdensityfunctionparameterizedbyθ:pψ(x|θ)=exp{xTθ−ψ(θ)−gψ(x)}ψistheso-calledcumulantfunction,andisaconvexfunctionofLegendretypeExample—considersphericalGaussiansparameterizedbymeanμ(withfixedvarianceσ):p(x)=1 (2πσ2)dexp −12σ2kx−μk2 =1 (2πσ2)dexpxT μσ2 −σ22 μσ2 2−xTx2σ2Thusθ=μσ2,andψ(θ)=σ22θ2ExponentialFamiliesofDistributionsDefinition.AregularexponentialfamilyisafamilyofprobabilitydistributionsonRdwithdensityfunctionparameterizedbyθ:pψ(x|θ)=exp{xTθ−ψ(θ)−gψ(x)}ψistheso-calledcumulantfunction,andisaconvexfunctionofLegendretypeExample—considersphericalGaussiansparameterizedbymeanμ(withfixedvarianceσ):p(x)=1 (2πσ2)dexp −12σ2kx−μk2 =1 (2πσ2)dexpxT μσ2 −σ22 μσ2 2−xTx2σ2Thusθ=μσ2,andψ(θ)=σ22θ2Note:Gaussiandistribution←→SquaredLossExamplePoissonDistribution:p(x)=λxx!e−λ,x∈Z+ThePoissonDistributionisamemberoftheexponentialfamilyIsthereaDivergenceassociatedwiththePoissonDistribution?ExamplePoissonDistribution:p(x)=λxx!e−λ,x∈Z+ThePoissonDistributionisamemberoftheexponentialfamilyIsthereaDivergenceassociatedwiththePoissonDistribution?YES—p(x)canbewrittenasp(x)=exp{−Dϕ(x,μ)−gϕ(x)},whereDϕistheRelativeEntropy,i.e.,Dϕ(x,μ)=xlog xμ −x+μExamplePoissonDistribution:p(x)=λxx!e−λ,x∈Z+ThePoissonDistributionisamemberoftheexponentialfamilyIsthereaDivergenceassociatedwiththePoissonDistribution?YES—p(x)canbewrittenasp(x)=exp{−Dϕ(x,μ)−gϕ(x)},whereDϕistheRela
本文标题:Bregman-Divergences
链接地址:https://www.777doc.com/doc-3940997 .html