A comparison of multiple linear regression, BP neural networks, and support vector regression. My own understanding:

Multiple linear regression: a linear combination of several attributes; the fit is produced by repeatedly adjusting each attribute's weight so that the regression function matches as many of the samples as possible.

BP neural network: uses steepest descent, adjusting the network's weights and thresholds by back-propagation until the network's sum of squared errors is minimized.

Support vector regression (SVR): it also works on every sample, seeking a fit in which all samples lie within the smallest possible margin of the final fitted curve.

Comparison of the algorithms:

BP objective function: J = (1/2) Σ_{j=1..m} (d_j − y_j)², accumulated over the training patterns
BP weight adjustment: Δw_ij^(k) = −η ∂J/∂w_ij^(k)
SVR objective function: min (1/2)‖w‖² (the full ε-insensitive formulation also adds a penalty C Σ(ξ_i + ξ_i*) for samples outside the ε-tube)

Support vector machines (SVM) are, like neural networks, learning machines; but unlike neural networks, SVMs are built on mathematical methods and optimization techniques.

Comparison of learning efficiency:

Importing the data: File -> Import Data

Commonly used parameter-optimization calls:

[train_pca,test_pca] = pcaForSVM(train_data,test_data,97); % principal component analysis
[bestCVmse,bestc,bestg,ga_option] = gaSVMcgForRegress(train_label,train_pca);
[bestmse,bestc,bestg] = SVMcgForRegress(train_label,train_data)
cmd = ['-c ',num2str(bestc),' -g ',num2str(bestg),' -s 3 -p 0.01'];
train_label = data(1:50,1);
train_data = data(1:50,2:14);
model = svmtrain(train_label,train_data,'-s 3 -t 2 -c 2.2 -g 2.8 -p 0.01');
test_label = data(51:100,1);
test_data = data(51:100,2:14);
[predict_label,mse,dec_value] = svmpredict(test_label,test_data,model);
[bestmse,bestc,bestg] = SVMcgForRegress(train_label,train_data)
cmd = ['-c ',num2str(bestc),' -g ',num2str(bestg),' -s 3 -p 0.01'];

Organized code:

Part 1: From the standpoint of the kernel function, does model performance improve when a different kernel type is chosen?

1. RBF kernel:

Before optimization:

train_label = data(1:50,1);
train_data = data(1:50,2:14);
model = svmtrain(train_label,train_data,'-s 3 -t 2 -c 2.2 -g 2.8 -p 0.01');
[predict_label,mse,dec_value] = svmpredict(train_label,train_data,model);
% the line above compares the training labels with the model's own
% predictions, giving the mean squared error between actual and predicted values
test_label = data(51:100,1);
test_data = data(51:100,2:14);
[predict_label,mse,dec_value] = svmpredict(test_label,test_data,model);

After optimization:

train_label = data(1:50,1);
train_data = data(1:50,2:14);
[bestmse,bestc,bestg] = SVMcgForRegress(train_label,train_data) % optimization method: grid search for now
cmd = ['-c ',num2str(bestc),' -g ',num2str(bestg),' -s 3 -t 2 -p 0.01'];
model = svmtrain(train_label,train_data,cmd);
[ptrain,mse,dec_value] = svmpredict(train_label,train_data,model);
figure; % plot predicted against actual values
subplot(2,1,1);
plot(train_label,'-o');
hold on;
plot(ptrain,'r-s');
grid on;
legend('original','predict');
title('Train Set Regression Predict by SVM');

2. Polynomial kernel:

train_label = data(1:50,1);
train_data = data(1:50,2:14);
[bestmse,bestc,bestg] = SVMcgForRegress(train_label,train_data);
cmd = ['-c ',num2str(bestc),' -g ',num2str(bestg),' -s 3 -t 1 -p 0.01'];
model = svmtrain(train_label,train_data,cmd);
[ptrain,mse] = svmpredict(train_label,train_data,model);
figure; % plot predicted against actual values
subplot(2,1,1);
plot(train_label,'-o');
hold on;
plot(ptrain,'r-s');
grid on;
legend('original','predict');
title('Train Set Regression Predict by SVM');

Mean squared error = 14505.6 (regression)
Squared correlation coefficient = 0.349393 (regression)

3. Linear kernel (0 -- linear: u'*v):

train_label = data(1:50,1);
train_data = data(1:50,2:14);
[bestmse,bestc,bestg] = SVMcgForRegress(train_label,train_data);
cmd = ['-c ',num2str(bestc),' -g ',num2str(bestg),' -s 3 -t 0 -p 0.01'];
model = svmtrain(train_label,train_data,cmd);
[ptrain,mse] = svmpredict(train_label,train_data,model);
figure; % plot predicted against actual values
subplot(2,1,1);
plot(train_label,'-o');
hold on;
plot(ptrain,'r-s');
grid on;
legend('original','predict');
title('Train Set Regression Predict by SVM');

Mean squared error = 14537 (regression)
Squared correlation coefficient = 0.389757 (regression)

4. Sigmoid kernel (tanh(gamma*u'*v + coef0), the nonlinear activation function of a neuron):

train_label = data(1:50,1);
train_data = data(1:50,2:14);
[bestmse,bestc,bestg] = SVMcgForRegress(train_label,train_data);
cmd = ['-c ',num2str(bestc),' -g ',num2str(bestg),' -s 3 -t 3 -p 0.01'];
model = svmtrain(train_label,train_data,cmd);
[ptrain,mse] = svmpredict(train_label,train_data,model);
figure; % plot predicted against actual values
subplot(2,1,1);
plot(train_label,'-o');
hold on;
plot(ptrain,'r-s');
grid on;
legend('original','predict');
title('Train Set Regression Predict by SVM');

Mean squared error = 24326.5 (regression)
Squared correlation coefficient = 0.271859 (regression)

[Figure: senior Jiang Liang's test cost vs. factor results]
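Since the four kernel blocks above repeat the same steps and differ only in the -t flag handed to svmtrain, the whole comparison can be reproduced in one loop. The sketch below is not from the original note; it assumes, as above, that data holds the label in column 1 and the features in columns 2:14, and that libsvm's MATLAB interface (svmtrain/svmpredict) and SVMcgForRegress are on the path.

% Sketch: compare training-set MSE and r^2 across the four libsvm kernels.
train_label = data(1:50,1);
train_data  = data(1:50,2:14);
kernel_names = {'linear','polynomial','RBF','sigmoid'};   % libsvm -t 0..3
for t = 0:3
    % grid-search c and g (SVMcgForRegress cross-validates internally)
    [bestmse,bestc,bestg] = SVMcgForRegress(train_label,train_data);
    cmd = ['-c ',num2str(bestc),' -g ',num2str(bestg), ...
           ' -s 3 -t ',num2str(t),' -p 0.01'];
    model = svmtrain(train_label,train_data,cmd);
    % with -s 3, the 2nd output is [accuracy; MSE; squared correlation]
    [~,stats,~] = svmpredict(train_label,train_data,model);
    fprintf('%-10s kernel: MSE = %g, r^2 = %g\n', ...
            kernel_names{t+1},stats(2),stats(3));
end

One design note: SVMcgForRegress takes no kernel argument, so if it cross-validates with libsvm's default RBF kernel internally, every iteration finds the same (bestc, bestg) pair and the search could be hoisted out of the loop; it is kept inside here only to mirror the four blocks above.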
Note: Part 1 builds its test efficiency-factor model from only the first 50 samples. The more training samples are selected (approaching all 100), the worse the fit looks. For example, with the RBF kernel:

Mean squared error = 20424.8 (regression)
Squared correlation coefficient = 0.527831 (regression)

The more samples are selected, the larger the resulting MSE (although the MSE increases, prediction on the samples should actually get better, because learning capacity grows with the number of samples), while the squared correlation coefficient improves (closest to 1 is best).

Open question: why is bestmse = 2.3162e+004 so far from the Mean squared error = 20424.8 (regression) obtained in actual training? (A plausible explanation: bestmse returned by SVMcgForRegress is a cross-validation MSE estimated on held-out folds, whereas the printed value is the error of the final model on the data it was fitted to, so the two need not agree.)

Part 2: Which parameter-optimization method yields better parameters? This comparison uses the RBF kernel throughout.

1. Grid search.

Code:

train_label = data(1:50,1);
train_data = data(1:50,2:14);
[bestmse,bestc,bestg] = SVMcgForRegress(train_label,train_data) % optimization method: grid search for now
cmd = ['-c ',num2str(bestc),' -g ',num2str(bestg),' -s 3 -t 2 -p 0.01'];
model = svmtrain(train_label,train_data,cmd);
[ptrain,mse,dec_value] = svmpredict(train_label,train_data,model);

Results:

bestmse = 1.5542e+004
bestc = 27.8576
bestg = 0.0039
Mean squared error = 14107.4 (regression)
Squared correlation coefficient = 0.386814 (regression)

2. Genetic-algorithm search.

Code:

train_label = data(1:50,1);
train_data = data(1:50,2:14);
[bestCVmse,bestc,bestg,ga_option] = gaSVMcgForRegress(train_label,train_data)
cmd = ['-c ',num2str(bestc),' -g ',num2str(bestg),' -s 3 -t 2 -p 0.01'];
model = svmtrain(train_label,train_data,cmd);
[ptrain,mse,dec_value] = svmpredict(train_label,train_data,model);

Results:

bestCVmse = 1.8944e+004
bestc = 59.5370
bestg = 778.3573
ga_option =
    maxgen: 200
    sizepop: 20
    ggap: 0.9000
    cbound: [0 100]
    gbound: [0 1000]
    v: 5
Mean squared error = 10426.1 (regression)
Squared correlation coefficient = 0.622133 (regression)

3. PSO search. (Here the heuristic PSO algorithm is used for the parameter search. Grid search, i.e. partitioning the c-g plane, can find the parameters with the highest accuracy in the CV sense, the global optimum over the grid, but searching a larger range of c and g this way can be very time-consuming; a heuristic algorithm can reach the global optimum without visiting every parameter point of the grid.)

Code:

train_label = data(1:50,1);
tra
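The PSO block breaks off mid-line in the source. By analogy with the grid-search and GA blocks it presumably continued as sketched below; the function name psoSVMcgForRegress is an assumption (nothing in this note confirms it) and should be checked against the toolbox that provides gaSVMcgForRegress before running.

% Hypothetical completion of the truncated PSO block -- mirrors blocks 1 and 2.
train_label = data(1:50,1);
train_data  = data(1:50,2:14);
% psoSVMcgForRegress is assumed by analogy with gaSVMcgForRegress
[bestCVmse,bestc,bestg] = psoSVMcgForRegress(train_label,train_data);
cmd = ['-c ',num2str(bestc),' -g ',num2str(bestg),' -s 3 -t 2 -p 0.01'];
model = svmtrain(train_label,train_data,cmd);
[ptrain,mse,dec_value] = svmpredict(train_label,train_data,model);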