Grey Forecast model for accurate recommendation in the presence of data sparsity and correlation

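Existing model-based CF approaches: clustering CF models, Bayesian belief nets (BNs) CF models, Markov decision process based (MDP-based) CF models, and latent semantic CF models. Dimensionality-reduction techniques such as SVD are used to handle data sparsity, but they lose key data.

This paper handles the problem with the simplest method, the Cosine Distance measurement method.

Principle: we do not directly use the exact value of the similarities, but rather rank the items according to their similarities.

Application areas of grey forecasting: finance [23], the integrated circuit industry [24], the market for air travel [25], and underground pressure for working surface [26].

Experimental datasets: MovieLens and EachMovie.

Paper structure: Section 2 describes traditional CF methods and item-based CF (ICF) methods, the existing problems, and the paper's contributions. Section 3 describes in detail the proposed algorithm based on the GF model. Section 4 describes the experimental study, including the datasets, the evaluation metrics, the methods, and the analysis of the experiments, followed by the conclusion and future work.

2. Traditional CF methods

The main work can be divided into two major parts:
1. Similarity measurement
2. Rating prediction

Similarity measurement:
1. In ICF methods, the similarity s(ix, iy) between the items ix and iy is determined by the users who have rated both items.
2. The most popular methods are Cosine Distance and Pearson Correlation. Setup: let I be the set of all items rated by both the users ux and uy, and let U be the set of all users who have rated the items ix and iy.

Example: the item set I is Bread and Milk, and d is equal to the size of set I. In this case, d is equal to two (d = 2). Cake (ix) and Milk (iy) are rated by both Alice and Lucy (user set U).

2.1 Similarity computation methods

2.1.1 Cosine Distance

Cosine Distance computes the similarity between two vectors. For UCF, the similarity between two users is calculated from the ratings of user ux and user uy on each item i. For ICF, the similarity between two items is calculated from the ratings of each user u on items ix and iy.

2.1.2 Pearson correlation coefficient

During similarity computation, the effect of rating correlation can be removed by using average ratings. The Pearson correlation coefficient improves the accuracy of the similarity computation to some extent. For the similarity between users, the mean term is the user's average rating over all movies. The similarity between items is computed analogously.

2.2 Rating prediction

Idea: the k Nearest Neighbors (KNN) method [37] is usually used for prediction by weighting the sum of the ratings that similar users give to the target item, or the ratings of the active user on similar items, depending on whether UCF or ICF is used.

2.2.1 User-based (UCF)

Idea: UCF is based on the basic assumption that people who share similar past preferences will be interested in similar items.

Algorithm steps: first, the similarities between the users are computed using the similarity measurement methods introduced in Section 2.1; then, the prediction for the active user is determined by taking the weighted average of all the ratings of the similar users for a certain item [37] according to the formula in Eq. (5); finally, the items with the highest predicted ratings are recommended to the user. Here U(ux) denotes the set of users similar to the user ux, and p(ux, i) is the prediction for the user ux on item i.

2.2.2 Item-based (ICF)

Idea: the algorithm recommends to users items that are similar to the items they have already consumed. As in UCF, the prediction is made after calculating the similarities between the items. Here I(ix) denotes the set of items similar to item ix, and p(u, ix) denotes the prediction of user u on item ix.

2.3 Problem analysis

The analysis starts from data sparsity and data correlation. The similarity values themselves are not used; only their ranking is used.

Key point of the paper: we only rank the items according to the similarity. For both user similarity and item similarity, the method does not use the computed similarity values but the ranking derived from them, because the similarity values themselves contain errors.

Then, to generate the prediction of the active user u on item i, the k items most similar to item i that have been rated by the active user are selected. Finally, we use these items as the input to build a GF model and predict the rating of the active user u on item i. If the user u has rated fewer than k items, a fixed value is used to complete the k ratings. Empirically, the fixed value can be the median value of the rating scale. For example, when the rating scale is 1–5, the number 3 is selected as the fixed value.

The proposed method provides the following three main contributions (advantages):
1. Overcoming data sparsity
2. Benefiting from data correlation
3. Obtaining accurate predictions

3. Proposed algorithm

Idea: traditional CF uses the ratings of similar users for a target item, or the ratings of the active user for similar items, to generate a prediction. In this paper, the GF model is used for rating prediction. It involves two steps: rating preprocessing and rating prediction.

3.1 Rating preprocessing

The similarity between items is used to produce the rating prediction.

Algorithm steps: First, for simplicity, the Cosine Distance method is utilized to compute the similarity between two items. Then, an m × m similarity matrix is generated, where m is the number of items. If we want to predict the unrated entry of the user u on item i in the rating matrix, the k items most similar to item i that have been rated by the user u are selected. Note that when the user u has rated fewer than k items, the fixed value, treated as having the lowest similarity, is used to complete the k ratings. Finally, the k ratings are sorted in order of increasing similarity to the item i to produce a rating sequence. In the next step, the proposed algorithm inputs the rating sequence to the GF model and forecasts the rating that the user u will give to item i.

In short: after the item similarities are computed, the items are sorted by similarity. When K is greater than the number of nearest-neighbour items the user has rated, the missing entries are filled with the fixed value. For example, a sequence that is (4, 3, 5) for K = 3 becomes (3, 3, 5, 4, 4, 3, 5) for K = 7.

Worked example: using cosine similarity (the similarity matrix), the items similar to i1 are, in descending order of similarity, 10, 6, 2, 8, 4, 9, 3, 7, 5. User u3 has rated only items 3, 4, 5, 7, 9, so the candidates in increasing order of similarity are 5, 7, 3, 9, 4. If K = 3, the selected items are 3, 9, 4, chosen because they are the most similar items that u3 has rated. If K = 7, only five items have been rated, with ratings 5, 4, 4, 3, 5; the remaining two positions are filled with the fixed value 3 at the low-similarity end, giving the rating sequence (3, 3, 5, 4, 4, 3, 5). When ties occur, they are broken at random.

Why this works:
1. Sorting by similarity lets the prediction exploit the correlation between items, which makes the rating prediction more effective.
2. Only the K most similar items are selected, so the prediction is more accurate.

3.2 Rating prediction

Why a grey forecast model: grey theory mainly focuses on model uncertainty and information insufficiency when analyzing and understanding systems via research on conditional analysis, prediction, and decision making. A recommender system can be considered as a grey system; further, with our algorithm, the GF model is used to yield the rating prediction. The GF model utilizes accumulated generation operations to build differential equations, which benefit from the data correlations. Meanwhile, it has another significant characteristic wherein it requires less data, so it can overcome the data sparsity problem. The rating sequence generated in the rating preprocessing stage is the only input required for model construction and subsequent forecasting.

Step 1: Define the original rating sequence, where k is the number of nearest-neighbour items.

Step 2: Generate the accumulated sequence from the original sequence by the accumulated generating operation (AGO). This is the most important step. For example, take a user's original rating sequence. Obviously, the sequence does not have a clear regularity. If AGO is applied to this sequence, ...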
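The preprocessing example and the accumulation step from these notes can be sketched in Python as follows. This is a minimal illustration, not the paper's implementation: the similarity values are hypothetical numbers chosen only to reproduce the ranking 10, 6, 2, 8, 4, 9, 3, 7, 5, the function names are invented for this sketch, and `gm11_forecast` uses the textbook GM(1,1) grey model as an assumption, since the notes break off before giving the paper's own equations. Ratings are assumed to be positive.

```python
import math

def build_rating_sequence(sim_to_target, user_ratings, k, fill_value):
    """Pick the k rated items most similar to the target, order them by
    increasing similarity, and pad the low-similarity end with fill_value."""
    rated = [item for item in user_ratings if item in sim_to_target]
    top = sorted(rated, key=lambda item: sim_to_target[item], reverse=True)[:k]
    top.reverse()  # least similar first, most similar last
    seq = [user_ratings[item] for item in top]
    return [fill_value] * (k - len(seq)) + seq

def ago(seq):
    """Accumulated generating operation: running sum of the sequence."""
    out, total = [], 0
    for value in seq:
        total += value
        out.append(total)
    return out

def gm11_forecast(seq):
    """One-step-ahead forecast with the textbook GM(1,1) model (assumed here)."""
    n = len(seq)
    if n < 3:
        raise ValueError("GM(1,1) needs at least three values")
    x1 = ago(seq)
    # Background values: means of consecutive accumulated values.
    z = [0.5 * (x1[j] + x1[j - 1]) for j in range(1, n)]
    y = seq[1:]
    # Least-squares fit of y = -a * z + b.
    z_mean, y_mean = sum(z) / len(z), sum(y) / len(y)
    sxx = sum((v - z_mean) ** 2 for v in z)
    sxy = sum((v - z_mean) * (w - y_mean) for v, w in zip(z, y))
    slope = sxy / sxx
    a, b = -slope, y_mean - slope * z_mean
    if abs(a) < 1e-12:
        return b  # degenerate case: the accumulated series grows linearly
    c = seq[0] - b / a
    # Difference of two consecutive fitted accumulated values.
    return c * (math.exp(-a * n) - math.exp(-a * (n - 1)))

# Hypothetical similarities to item 1, consistent with the ranking in the notes.
sim_to_i1 = {10: 0.9, 6: 0.8, 2: 0.7, 8: 0.6, 4: 0.5, 9: 0.4, 3: 0.3, 7: 0.2, 5: 0.1}
# Ratings of user u3, as implied by the worked example.
u3_ratings = {3: 4, 4: 5, 5: 5, 7: 4, 9: 3}

seq3 = build_rating_sequence(sim_to_i1, u3_ratings, k=3, fill_value=3)  # [4, 3, 5]
seq7 = build_rating_sequence(sim_to_i1, u3_ratings, k=7, fill_value=3)  # [3, 3, 5, 4, 4, 3, 5]
accumulated = ago(seq7)  # [3, 6, 11, 15, 19, 22, 27]
prediction = gm11_forecast(seq7)
```

The accumulated sequence is monotonically increasing even though the raw ratings fluctuate, which is the regularity the grey model relies on. A practical system would probably also clip `prediction` to the rating scale; that step is not described in the notes.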
