DataStage Introductory Training
Instructor: 邱明伟
Date: 2010-03-01
Copyright 东南融通

Agenda
· DataStage introduction
· DataStage development
  1. Using the four DataStage clients
  2. Using common DataStage components
· Common DataStage commands
· Exercises

DataStage Introduction

Ascential Platform

What is DataStage?
· Design jobs for Extraction, Transformation, and Loading (ETL)
· Ideal tool for data integration projects such as data warehouses, data marts, and system migrations
· Import, export, create, and manage metadata for use within jobs
· Schedule, run, and monitor jobs, all within DataStage
· Administer your DataStage development and execution environments

DataStage Development

DataStage Server and Clients
· Administrator: administers DataStage projects and conducts housekeeping on the server
· Designer: creates DataStage jobs that are compiled into executable programs
· Director: used to run and monitor DataStage jobs
· Manager: allows you to view and edit the contents of the repository

DataStage Administrator
· In DataStage all development work is done within a project.
· Projects are created during installation, and after installation using Administrator.
· Each project is associated with a directory. The directory stores the objects (jobs, metadata, custom routines, etc.) created in the project.
· Before you can work in a project you must attach to it (open it).
· You can set the default properties of a project using DataStage Administrator.

Use the Administrator to specify general server defaults, add and delete projects, and set project properties. Use the Administrator Project Properties window to:
· Set job monitoring limits and other Director defaults on the General tab.
· Set user group privileges on the Permissions tab.
· Enable or disable server-side tracing on the Tracing tab.
· Specify a user name and password for scheduling jobs on the Schedule tab.
· Specify hashed file stage read and write cache sizes on the Tunables tab.

DataStage Manager
DataStage Manager manages two different types of objects:
· Metadata describing sources and targets:
  - Called table definitions in Manager. These are not to be confused with relational tables. DataStage table definitions are used to describe the format and column definitions of any type of source: sequential, relational, hashed file, etc.
  - Table definitions can be created in Manager or Designer, and they can also be imported from the sources or targets they describe.
· DataStage components:
  - Every object in DataStage (jobs, routines, table definitions, etc.) is stored in the DataStage repository. Manager is the interface to this repository.
  - DataStage components, including whole projects, can be exported from and imported into Manager.

· Any object in Manager can be exported to a file
· Whole projects can be exported
· Use for backup
· Sometimes used for version control
· Can be used to move DataStage objects from one project to another
· Use to share DataStage jobs and projects with other developers

Import Procedure
· In Manager, click "Import DataStage Components"
· Select the DataStage objects to import

Export Procedure
· In Manager, click "Export DataStage Components"
· Select the DataStage objects to export
· Specify the type of export: DSX or XML
· Specify the file path on the client machine

DataStage Director
· Can schedule, validate, and run jobs
· Can be invoked from DataStage Manager or Designer
· Clear job log
· Set Director options:
  - Row limits
  - Abort after x warnings

Director Log View
Click the Log button in the toolbar to view the job log. The job log records events that occur during the execution of a job. These events include control events, such as the starting, finishing, and aborting of a job; informational messages; warning messages; error messages; and program-generated messages.

DataStage Designer

What Is a Job?
· Executable DataStage program
· Created in DataStage Designer, but can use components from Manager
· Built using a graphical user interface
· Compiles into Orchestrate shell language (OSH)

Create New Job
Several types of DataStage jobs:
· Parallel: this course will concentrate on parallel jobs.
· Job Sequence: used to create jobs that control the execution of other jobs.

Component Introduction

Sequential File
Features: suitable for ordinary sequential files (fixed-length or variable-length records); can read plain text files or IBM mainframe EBCDIC files.
Usage notes:
· Name the stage according to the naming convention.
· Select the stage and double-click it; on the General tab, describe the file's contents, format, storage directory, and so on.
· Edit the file properties: file name and reject mode.
· Edit the file format, for example the record terminator, the field delimiter, and the character used to quote strings.
· Enter the column definitions for the file.

Annotation
Features: generally used for comments. Its background color can be used to mark off different functional blocks within a job.

Copy Stage
Function: the Copy stage can have one input and multiple outputs. It can change the order of columns on output, but it cannot change column types.

Filter Stage
Function: the Filter stage has exactly one input and can have multiple outputs. Rows are routed to different output links according to the filter conditions.

Sort Stage
Function: has exactly one input and one output, and sorts rows by the specified key. You can choose ascending or descending order, whether to remove duplicate rows, and so on.

Sort Stage Options
· Allow Duplicates: whether duplicate rows are kept. When False, only one row is kept for each key value; if Stable Sort is True, the first row is the one kept. This option has no effect when Sort Utility is UNIX.
· Sort Utility: the program used to perform the sort. You can choose the DataStage built-in sort or the UNIX sort command.
· Output Statistics: whether to write sort statistics to the job log.
· Stable Sort: whether rows with equal key values keep their original input order.
· Create Cluster Key Change Column: whether to add a new column, clusterKeyChange, to each row. When Sort Key Mode is Don't Sort (Previously Sorted) or Don't Sort (Previously Grouped), this column is set to 1 for the first row in each group and to 0 for the remaining rows.
· Create Key Change Column: whether to add a new column, keyChange, to each row.

RemoveDuplicat
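The Sort stage options described above (Stable Sort, Allow Duplicates, and the key change column) can be illustrated outside DataStage with a small Python sketch. This is not DataStage code and does not reflect how the stage is implemented. The function `sort_stage`, its parameters, and the sample rows are hypothetical names chosen for illustration, and the sketch assumes the key change flag is 1 for the first row of each key group and 0 otherwise, as the slides state for clusterKeyChange.

```python
from itertools import groupby
from operator import itemgetter


def sort_stage(rows, key, descending=False, allow_duplicates=True,
               key_change_column=False):
    """Illustrative stand-in for the Sort stage options (hypothetical helper).

    rows: list of dicts, one dict per record.
    key:  name of the column to sort on.
    """
    # Python's sorted() is stable, so rows with equal keys keep their
    # input order. This mirrors Stable Sort = True.
    ordered = sorted(rows, key=itemgetter(key), reverse=descending)

    result = []
    for _, group in groupby(ordered, key=itemgetter(key)):
        for position, row in enumerate(group):
            if position > 0 and not allow_duplicates:
                # Allow Duplicates = False: keep only the first row per key.
                break
            out_row = dict(row)  # copy so the input rows are not modified
            if key_change_column:
                # 1 marks the first row of each key group, 0 the rest.
                out_row["keyChange"] = 1 if position == 0 else 0
            result.append(out_row)
    return result


# Made-up sample data for illustration only.
rows = [
    {"cust": "B", "amt": 10},
    {"cust": "A", "amt": 20},
    {"cust": "B", "amt": 30},
    {"cust": "A", "amt": 40},
]

flagged = sort_stage(rows, "cust", key_change_column=True)
deduped = sort_stage(rows, "cust", allow_duplicates=False)

for row in flagged:
    print(row)
for row in deduped:
    print(row)
```

In `flagged`, each customer's first row carries `keyChange` 1, which is the kind of marker a downstream stage could use to detect group boundaries. In `deduped`, only the first row per customer survives, and because the sort is stable that is the row that appeared first in the input.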