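Criteria of Tests

• Validity
• Reliability
• Power/Difficulty
• Discrimination
• Practicality
• Backwash effects

Validity

The validity of a test is the extent to which it measures what it is supposed to measure and nothing else. Put another way, validity asks whether a test examines what its designer intended to examine, and to what degree.

Discuss the following items:
• "Is photography an art or a science?" Discuss.
• "The mind is its own place, and in itself can make a Heaven of Hell, a Hell of Heaven." (Milton) Discuss.

Use the following words in sentences: courageous, choosy, acceptable, complicated, etc.
A. John is a very courageous boy.
B. John, the captain of our team, is courageous.
C. I have a courageous father.

Factors of validity
• Face validity
• Content validity
• Construct validity
• Empirical validity
• Concurrent validity
• Predictive validity

Face validity
• If a test item looks right to other testers, teachers, moderators, and testees, it can be described as having at least face validity.
• Face validity refers to the surface credibility of a test, or its acceptability to the public.
• Zou Shen: a test has face validity if it appears to test the intended skill or ability. (Testing pronunciation and intonation with a written paper, for example, has low face validity.)

Content validity
• A test is said to have content validity if its content constitutes a representative sample of the language skills, structures, etc. with which it is meant to be concerned.
• Content validity concerns whether a test examines what the syllabus specifies, that is, how well the test items represent the target they are meant to measure.
(1) Is the content of the test related to its objective or purpose?
(2) Are the test items representative?
(3) Is the content appropriate for the testees?

Construct validity
• If a test has construct validity, it is capable of measuring certain specific characteristics in accordance with a theory of language behavior and learning.
• Construct validity concerns whether a test is grounded in a sound view of language (including views of language learning and of language use). "Construct" here does not mean the layout of the paper or the ordering of the items; it means the theoretical basis of the whole test.

Empirical validity
• This validity is obtained by comparing the results of the test with the results of some criterion measure.
• Empirical (statistical) validity is divided into concurrent validity and predictive validity.

Concurrent validity
If the results of the test are compared with the results of some criterion measure such as:
• an existing test, known or believed to be valid and given at the same time; or
• the teacher's ratings or any other such form of independent assessment given at the same time,
then the results obtained by either of these two methods are measures of the test's concurrent validity in respect of the particular criterion used.
• In other words, concurrent validity is established when the test and the criterion are administered at about the same time.
• Concurrent validity is expressed as a coefficient obtained by comparing the results of one test with those of another test given at the same or nearly the same time, or with the teacher's assessment of the students. For example, if end-of-term examination scores are compared with scores from a Band 4 examination that has just been held and the score patterns are similar, the end-of-term examination has fairly high concurrent validity. (Premise: the Band 4 examination itself is highly valid.)

Predictive validity
If the results of the test are compared with the results of some criterion measure such as:
• the subsequent performance of the testees on a certain task measured by some valid test; or
• the teacher's ratings or any other such form of independent assessment given later,
then the results obtained by either of these two methods are measures of the test's predictive validity in respect of the particular criterion used.
• In other words, predictive validity concerns the degree to which a test can predict the testees' future performance or success.
• Predictive validity is about a test's predictive power: how far its results can predict certain future outcomes, or whether the test can forecast students' later performance or achievement.

Reliability

A test is said to be reliable if it is consistent in its measurements. Reliability refers to the dependability and stability of test results. For example, if the same paper is administered to the same group of students two or more times and the results agree closely, the test has fairly high reliability.

Methods of verifying test reliability
• Test/retest method
• Split-half method
• Parallel forms method

Test/retest method
This method is to re-administer the same test after a lapse of time. It is often impracticable since certain students will benefit more than others by a familiarity with the type and format of the test. Moreover, in addition to changes in performance resulting from the memory factor, personal factors such as motivation and differential maturation will also account for differences in the performances of certain students.

Split-half method
This method estimates a different kind of reliability from that estimated by the test/retest procedure. It is based on the principle that, if an accurate measuring instrument were broken into two equal parts, the measurements obtained with one part would correspond exactly to those obtained with the other.

Parallel forms method
This method is to administer parallel forms of the test to the same group. This assumes that two similar versions of a particular test can be constructed: such tests must be identical in the nature of their sampling, difficulty, length, rubrics, etc. Only after a full statistical analysis of the tests and all the items contained in them can the tests safely be regarded as parallel. If the correlation between the two tests is high, then the tests can be termed reliable.

Factors affecting test reliability
• Number of items
• Nature of the items
• Item discrimination
• Score distribution
• Item difficulty
• Objectivity of scoring
• Testing time

Power/Difficulty

Difficulty refers to how hard or easy each item in a test is. To judge the quality of a test paper, one must look beyond its reliability and validity, the two key indicators, and also examine the index of difficulty (facility value) of its items.

Calculating the difficulty value
Item difficulty is usually denoted P; the P value is the proportion of testees who answer the item correctly. Suppose there are 10 testees and 8 of them answer a given item correctly. The difficulty value of that item is then:
P = 8 / 10 = 0.8

Formula for subjective items
Suppose a writing task has a full mark of 20 and the mean score of all testees on that task is 16. The difficulty value of the task is then:
P = 16 / 20 = 0.8

[Figure: normal distribution curve]

Discrimination

Discrimination of a test is its capability to discriminate among the different candidates and to reflect the differences in the performance of the individuals in the group. Item discrimination is the degree to which a single item distinguishes between testees of different ability.

Methods of calculating item discrimination
• Formula method
• Point-biserial correlation coefficient
• Biserial correlation coefficient

Practicality

A good test is practical. It stays within financial limits and time constraints, and it is easy to administer, score, and interpret. Practicality refers to whether a test is convenient to use and feasible to administer.

Factors affecting practicality
• the length of time available for the administration of the test
• the answer sheet and the stationery used
• the test situation
• the necessary equipment
• the presentation of the test paper

Backwash effects

The term backwash (also sometimes referred to as washback) refers to the effects of a test on teaching a…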
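
Worked example for the Power/Difficulty and Discrimination sections: the sketch below computes the difficulty value P for an objectively scored item (8 of 10 testees correct) and for a subjectively scored item (mean score 16 out of 20), both taken from the examples in the text. The function names are hypothetical. The discrimination function is an assumption: the source only names a "formula method" without stating it, so the sketch uses one common upper/lower-group index, D = P(upper group) - P(lower group), with an assumed group size of 27% of the testees.

```python
from statistics import mean
from typing import Sequence


def facility_value(correct_count: int, total_testees: int) -> float:
    """Difficulty value P of an objective item: proportion answering correctly."""
    if total_testees <= 0:
        raise ValueError("total_testees must be positive")
    return correct_count / total_testees


def subjective_facility_value(scores: Sequence[float], full_mark: float) -> float:
    """Difficulty value P of a subjective item: mean score divided by full mark."""
    if not scores or full_mark <= 0:
        raise ValueError("need at least one score and a positive full mark")
    return mean(scores) / full_mark


def discrimination_index(
    item_correct: Sequence[int],
    total_scores: Sequence[float],
    group_fraction: float = 0.27,  # assumed convention, not given in the source
) -> float:
    """Assumed upper/lower-group index: D = P(upper group) - P(lower group).

    item_correct holds 1 (right) or 0 (wrong) for each testee on one item;
    total_scores holds the same testees' total test scores.
    """
    if len(item_correct) != len(total_scores) or not item_correct:
        raise ValueError("inputs must be non-empty and of equal length")
    # Rank testees from highest to lowest total score.
    ranked = sorted(zip(total_scores, item_correct), key=lambda p: p[0], reverse=True)
    size = max(1, round(len(ranked) * group_fraction))
    p_upper = sum(correct for _, correct in ranked[:size]) / size
    p_lower = sum(correct for _, correct in ranked[-size:]) / size
    return p_upper - p_lower


if __name__ == "__main__":
    # Objective item from the text: 8 of 10 testees answer correctly.
    print(facility_value(8, 10))

    # Subjective item from the text: full mark 20, mean score 16.
    # The individual scores are made up so that their mean is 16.
    print(subjective_facility_value([14, 16, 18], 20))

    # Made-up data for ten testees, used only to exercise the assumed index.
    totals = [95, 90, 88, 80, 75, 70, 65, 60, 50, 40]
    answers = [1, 1, 1, 1, 1, 1, 1, 0, 1, 0]
    print(discrimination_index(answers, totals))
```

With the made-up data, the item has the same difficulty value as the example in the text (8 of 10 correct), while the index compares the three highest-scoring testees with the three lowest. A value near 1 would mean the item separates strong from weak testees sharply; a value near 0 would mean it hardly separates them.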