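CF models: clustering CF models, Bayesian belief nets (BNs) CF models, Markov decision process based (MDP-based) CF models, and latent semantic CF models. Dimensionality reduction techniques (SVD) are used to handle data sparsity, but they lose key data.

This paper uses the simplest method (the Cosine Distance measurement method) to handle this.

Principle: we do not directly use the exact value of the similarities, but rather rank the items according to their similarities.

Application areas: such as finance [23], the integrated circuit industry [24], the market for air travel [25], and underground pressure for working surface [26].

Experimental datasets: MovieLens and EachMovie.

Paper structure: Section 2 describes traditional CF methods and item-based CF (ICF) methods, the existing problems, and the authors' contributions. Section 3 describes in detail the GF model proposed on top of the algorithm. Section 4 describes the experimental study, including the datasets, evaluation metrics, methods, analysis of the experiments, conclusions, and future work.

2. The main work can be divided into two major parts: 1. similarity measurement and 2. rating prediction

Similarity measurement:
1. In ICF methods, the similarity s(ix, iy) between the items ix and iy is determined by the users who have rated both items.
2. The most popular methods are Cosine Distance and Pearson Correlation. How they operate: let I be the set of all items rated by both the users ux and uy, and let U be the set of all users who have rated the items ix and iy.

Example: the item set I is Bread and Milk, and d is equal to the size of set I. In this case, d is equal to two (d = 2). Cake (ix) and Milk (iy) are rated by both Alice and Lucy (user set U).

2.1 Similarity computation methods

2.1.1 Cosine Distance

Cosine Distance measures the similarity between two vectors. For UCF, the similarity between two users is calculated from the ratings that user ux and user uy gave to each item i. For ICF, the similarity between two items is calculated from the ratings that each user u gave to items ix and iy.

2.1.2 Pearson Correlation coefficient

During similarity computation, the correlation of ratings can be removed by subtracting average ratings. The Pearson Correlation coefficient improves the accuracy of the similarity computation to some extent. The similarity between users is calculated using each user's average rating over all movies; the similarity between items is calculated analogously.

2.2 Rating prediction

Approach: the k Nearest Neighbors (KNN) method [37] is usually used for prediction by weighting the sum of the ratings that similar users give to the target item, or the ratings of the active user on similar items, depending on whether UCF or ICF is used.

2.2.1 User-based

Idea: it is based on the basic assumption that people who share similar past preferences will be interested in similar items.

Algorithm steps: first, the similarities between the users are computed using the similarity measurement methods introduced in Section 2.1; then, the prediction for the active user is determined by taking the weighted average of all the ratings of the similar users for a certain item [37] according to the formula in Eq. (5); finally, the items with the highest predicted ratings are recommended to the user. Here U(ux) denotes the set of users similar to the user ux, and p_{ux,i} is the prediction for the user ux on item i.

2.2.2 Item-based

Idea: the algorithm recommends to users items that are similar to the items they have already consumed. The prediction is made similarly, after calculating the similarities between the items, where I(ix) denotes the set of items similar to item ix, and p_{u,ix} denotes the prediction of user u on item ix.

2.3 Problem analysis

The problems are analyzed in terms of data sparsity and data correlation. The similarity values themselves are not used; they serve only to rank the data.

Key point of the paper: we only rank the items according to the similarity. For both user similarity and item similarity, the method does not use the computed similarity values but the ranking derived from them, because the similarity values themselves contain errors.

Then, to generate the prediction of the active user u on item i, the k items most similar to item i that have been rated by the active user are selected. Finally, we use the ratings of these items as the input to build a GF model and predict the rating of the active user u on item i. If the user u has rated fewer than k items, a fixed value is used to complete the k ratings. Empirically, the fixed value can be the median value of the rating scale. For example, when the rating scale is 1–5, the number 3 is selected as the fixed value.

The proposed method provides the following three main contributions (advantages):
1. Overcoming data sparsity
2. Benefiting from data correlation
3. Obtaining accurate predictions

3. Proposed algorithm

Idea: use the ratings of similar users for a target item, or the ratings of the active user for similar items, to generate the prediction. In this paper, the GF model is used for rating prediction. It involves two steps: rating preprocessing and rating prediction.

3.1. Rating preprocessing

The similarity between items is used to produce the rating prediction. Algorithm steps: first, for simplicity, the Cosine Distance method is utilized to compute the similarity between two items. Then, an m × m similarity matrix is generated, where m is the number of items. If we want to predict the unrated entry of the user u on item i in the rating matrix, the k items most similar to item i that have been rated by the user u are selected. Note that when the user u has rated fewer than k items, the fixed value, placed at the lowest-similarity positions, is used to complete the k ratings. Finally, the k ratings are sorted in increasing order of their items' similarity to item i to produce a rating sequence. In the next step, the proposed algorithm inputs the rating sequence to the GF model and forecasts the rating that the user u will give to item i.

After the item similarities are computed, the items are ranked by similarity (descending). When k is larger than the number of nearest-neighbor items, the missing ratings are replaced by the lowest rating. For example, the original sequence is (4, 3, 5); when k = 7, it becomes (3, 3, 5, 4, 4, 3, 5).

Worked example: using cosine similarity (the similarity matrix), the items similar to i1 are 10, 6, 2, 8, 4, 9, 3, 7, 5 (descending order); of these, the ones rated by user u3, in increasing order of similarity, are (5, 7, 3, 9, 4). If k = 3, the selected items are 3, 9, 4. They are chosen because they have been rated by user u3, who has rated only items 3, 4, 5, 7, 9. If k = 7, the rating sequence is (3, 3, 5, 4, 4, 3, 5): only 5 items have been rated, with ratings 5, 4, 4, 3, 5, so the remaining two positions are filled with the lowest rating, giving 3, 3, 5, 4, 4, 3, 5. When ties occur, they are broken at random.

1. Sorting by decreasing correlation makes the similarity between items more useful for predicting the rating.
2. Only the k most similar items are selected, so the prediction is more precise.

3.2 Rating prediction

Why the grey forecast model is used: it mainly focuses on model uncertainty and information insufficiency when analyzing and understanding systems via research on conditional analysis, prediction, and decision making. A recommender system can be considered as a grey system; further, with our algorithm, the GF model is used to yield the rating prediction. The GF model utilizes accumulated generation operations to build differential equations, which benefit from the data correlations. Meanwhile, it has another significant characteristic: it requires less data, so it can overcome the data sparsity problem. The rating sequence generated in the rating preprocessing stage is the only input required for model construction and subsequent forecasting.

Step 1: Set the original rating sequence, where k is the number of nearest-neighbor items in the sequence.

Step 2: Generate the accumulated sequence from the original sequence by the accumulated generation operation (AGO). This step is the most important one. For example, consider a user's original rating sequence. Obviously, the sequence does not have a clear regularity. If AGO is applied to this sequence, ...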
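The rating preprocessing of Section 3.1 and the accumulation of Step 2 can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the function names `build_rating_sequence` and `ago` are hypothetical, the item-to-rating mapping for user u3 is read off the worked example in the notes, and AGO is assumed to be the plain running sum used in grey forecasting.

```python
from itertools import accumulate


def build_rating_sequence(similar_desc, user_ratings, k, fill_value):
    """Build the length-k rating sequence fed to the GF model.

    similar_desc: item ids ranked by similarity to the target item,
                  most similar first (only the ranking is used).
    user_ratings: dict mapping item id -> rating given by the active user.
    fill_value:   fixed value used when fewer than k rated items exist
                  (assumption: chosen by the caller, e.g. 3 on a 1-5 scale).
    """
    # Keep the k most similar items that the user has actually rated.
    rated = [item for item in similar_desc if item in user_ratings][:k]
    # Order the ratings by increasing similarity to the target item.
    ratings = [user_ratings[item] for item in reversed(rated)]
    # Pad at the low-similarity end when fewer than k ratings are available.
    return [fill_value] * (k - len(ratings)) + ratings


def ago(sequence):
    """Accumulated generation operation: running sum of the sequence."""
    return list(accumulate(sequence))


# Data from the worked example: items similar to i1, most similar first,
# and the ratings of user u3 (who rated only items 3, 4, 5, 7, 9).
similar_to_i1 = [10, 6, 2, 8, 4, 9, 3, 7, 5]
u3_ratings = {3: 4, 4: 5, 5: 5, 7: 4, 9: 3}

print(build_rating_sequence(similar_to_i1, u3_ratings, 3, 3))  # [4, 3, 5]
seq7 = build_rating_sequence(similar_to_i1, u3_ratings, 7, 3)
print(seq7)       # [3, 3, 5, 4, 4, 3, 5]
print(ago(seq7))  # [3, 6, 11, 15, 19, 22, 27]
```

The accumulated sequence is non-decreasing even though the original ratings fluctuate, which is the regularity that the accumulation step is meant to expose before a differential equation is fitted.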
Document title: Grey Forecast Model for Accurate Recommendation in the Presence of Data Sparsity and Correlation
Link: https://www.777doc.com/doc-2215937 .html