BioConductor
Yong Zhang
Nov 4, 2009

Affymetrix Terminology
- Each gene or portion of a gene is represented by 11 to 20 oligonucleotides of 25 base-pairs.
- Probe: an oligonucleotide of 25 base-pairs, i.e., a 25-mer.
- Perfect match (PM): a 25-mer complementary to a reference sequence of interest (e.g., part of a gene).
- Mismatch (MM): same as PM but with a single homomeric base change at the middle (13th) base (transversion purine <-> pyrimidine, G <-> C, A <-> T).
- Probe-pair: a (PM, MM) pair.
- Probe-pair set: a collection of probe-pairs (11 to 20) related to a common gene or fraction of a gene.
- Affy ID: an identifier for a probe-pair set.
- The purpose of the MM probe design is to measure non-specific binding and background noise.

Affymetrix files
- Main software for low-level analysis: RMA
- DAT file: image file, ~10^7 pixels, ~50 MB.
- CEL file: cell intensity file, probe-level PM and MM values.
- CDF file: Chip Description File. Describes which probes go in which probe sets and the location of probe-pair sets (genes, gene fragments, ESTs).
- Custom CDF files:

Preparation
- Make a working directory on your computer: C:\GenomeInformaticsLab
- Copy the ".cel" files to your working directory. The data can be retrieved from:
- Install BioConductor:
    source()
    biocLite()
- Set the working directory in R to the one just created:
    setwd("C:/GenomeInformaticsLab")
  Or change it through "File - Change dir".

Load libraries
- Load the 'affy' library:
    library(affy)

Read data
- Make a phenotype file for the data.
- Read the phenotype data:
    pd <- read.AnnotatedDataFrame("er_experiment.txt", header=TRUE, row.names=1, as.is=TRUE)
- Read the raw data from the .cel files:
    files <- dir(pattern=".cel")   # the order here should be consistent with the phenotype file
    AffyData <- read.affybatch(filenames=files, phenoData=pd)
- Sample names (the Name column of the phenotype file):
    Name
    T00.1
    T00.2
    T00.3
    T03.1
    T03.2
    T03.3

Quality Control
- View array image:
    image(AffyData[,1])
- Boxplot of raw intensities:
    boxplot(AffyData)
- Histogram of raw probe intensities:
    hist(log2(pm(AffyData[,1])), breaks=100, col="blue")
- MVA plots (normalization quality check):
    mva.pairs(pm(AffyData)[,c(1,3,5)])

Get custom CDF files
- Check the current CDF:
    AffyData@cdfName
- Get custom CDF files:
    UMRepos <- getOption("repositories2")
    UMRepos["UMRepository"] = '...'
    options('repositories2'=UMRepos)
    AffyData@cdfName <- "Hs133P_Hs_REFSEQ"

Expression index analysis
- Calculate gene expression index
  Robust Multi-chip Analysis (RMA):
    exprsSet.RMA <- rma(AffyData)
    exp.RMA <- exprs(exprsSet.RMA)
    boxplot(data.frame(exp.RMA))
    write.exprs(exprsSet.RMA, file='ER_expression.rma')
  MAS5:
    exprsSet.MAS5 <- mas5(AffyData)
    PACalls <- mas5calls(AffyData)
    write.exprs(PACalls, file='ER_PAcalls.list')

Fold change calculation
- Get gene names in the experiment:
    genes <- matrix(rownames(exp.RMA))
- Calculate fold change (RMA values are on the log2 scale, so the difference of means is a log2 fold change):
    time0 <- 1:3
    time3 <- 4:6
    foldchange <- apply(exp.RMA, 1, function(x) mean(x[time3]) - mean(x[time0]))

Hypothesis testing
- Perform t-test:
    pvalue <- apply(exp.RMA, 1, function(x) t.test(x[time3], x[time0], var.equal=TRUE)$p.value)
- Perform Welch t-test:
    Welch.pvalue <- apply(exp.RMA, 1, function(x) t.test(x[time3], x[time0], var.equal=FALSE)$p.value)
- Multiple tests correction:
    fdr <- p.adjust(pvalue, method="fdr")
  Available methods: 'bonferroni', 'BH' (Benjamini and Hochberg; 'fdr' is an alias for it), etc.

Final result and output
- Get the gene lists we want:
    genes.up <- genes[which(fdr < 0.05 & foldchange > 0)]
    genes.down <- genes[which(fdr < 0.05 & foldchange < 0)]
- Export the result for other uses:
    write.table(genes.up, file="upgene.xls", sep="\t", quote=FALSE, col.names=NA)

The Database for Annotation, Visualization and Integrated Discovery (DAVID)
- Identify enriched biological themes, particularly GO terms
- Discover enriched functional-related gene groups
- Cluster redundant annotation terms
- Visualize genes on BioCarta & KEGG pathway maps
- Display related many-genes-to-many-terms on 2-D view
- Search for other functionally related genes not in the list
- List interacting proteins
- Explore gene names in batch
- Link gene-disease associations
- Highlight protein functional domains and motifs
- Redirect to related literature
- Convert gene identifiers from one type to another
- And more

DAVID
- Total: 1,000 papers citing DAVID
- Daily usage: ~1200 gene lists/sublists from ~400 unique researchers.
- Total usage: ~800,000 gene lists/sublists from 5,000 research institutes world-wide

Where to start

Data submission

Get help

Thank you
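
Worked example (not part of the original slides)
The fold change, t-test and multiple-test correction steps above can be tried without any CEL files. What follows is a minimal sketch in base R on simulated data. The matrix `toy`, its size, the gene names, the size of the spiked-in shift and the random seed are all hypothetical stand-ins for `exp.RMA`, not values from the ER experiment.

```r
# Toy stand-in for exp.RMA: 200 genes x 6 arrays of log2 expression values.
# Everything here is simulated (assumption); nothing comes from the ER experiment.
set.seed(1)
n.genes <- 200
toy <- matrix(rnorm(n.genes * 6, mean = 8, sd = 0.3), nrow = n.genes,
              dimnames = list(paste0("gene", seq_len(n.genes)),
                              c("T00.1", "T00.2", "T00.3",
                                "T03.1", "T03.2", "T03.3")))
# Shift the first 10 genes up by 2 log2 units at time 3 so there is a signal.
toy[1:10, 4:6] <- toy[1:10, 4:6] + 2

time0 <- 1:3
time3 <- 4:6

# log2 fold change: difference of group means on the log2 scale
foldchange <- apply(toy, 1, function(x) mean(x[time3]) - mean(x[time0]))

# Welch t-test per gene (var.equal = FALSE is the t.test default)
pvalue <- apply(toy, 1, function(x) t.test(x[time3], x[time0])$p.value)

# Benjamini-Hochberg adjustment; "fdr" is an alias for "BH" in p.adjust
fdr <- p.adjust(pvalue, method = "BH")

# Same selection rule as in the slides
genes <- rownames(toy)
genes.up   <- genes[which(fdr < 0.05 & foldchange > 0)]
genes.down <- genes[which(fdr < 0.05 & foldchange < 0)]
print(genes.up)
print(genes.down)
```

With real data, `toy` would be replaced by `exp.RMA`. With only three replicates per group the per-gene t-test has limited power, so it is not unusual for few genes to pass the FDR cut-off.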