A General Coefficient of Similarity and Some of Its Properties
J. C. Gower
Rothamsted Experimental Station, Harpenden, Herts., U.K.

Biometrics, Vol. 27, No. 4 (Dec., 1971), pp. 857-871.
Stable URL: =0006-341X%28197112%2927%3A4%3C857%3AAGCOSA%3E2.0.CO%3B2-3
Biometrics is currently published by International Biometric Society.

SUMMARY

A general coefficient measuring the similarity between two sampling units is defined. The matrix of similarities between all pairs of sample units is shown to be positive semi-definite (except possibly when there are missing values). This is important for the multi-dimensional Euclidean representation of the sample and also establishes some inequalities amongst the similarities relating three individuals. The definition is extended to cope with a hierarchy of characters.

1. INTRODUCTION

A similarity coefficient measures the resemblance between two individuals based on either or both of two logically distinct kinds of information pertaining to v variables and allowing for possible missing information.

First there is information on the existence, or not, of the variables. In taxonomy, where similarity coefficients are often used, this may be the only kind of information used to build up a taxonomic classification. The taxonomist has the problem of deciding whether a character occurring in one group of organisms also occurs in another group; this is the so-called homology problem. A missing character should not be confused with missing information because it is known that the character definitely does not exist. Missing information can occur, for example, with incomplete fossil material or with poor descriptions in the literature, from which the existence or otherwise of a character cannot be inferred.

The other type of information pertains to observed values of qualitative or quantitative properties of existing characters. An absent character cannot have any associated properties and this suggests that the two types of information might be viewed hierarchically, a topic returned to in section 4.

A common simple situation occurs when all information is of the presence/absence type (or from 2-level qualitative characters). This gives the familiar 2 × 2 association table shown in Table 1, where presence is denoted by + and absence by -. Many different coefficients have been derived from Table 1. Yule's early work on this subject was reviewed by Yates [1952]. More recently Sokal and Sneath [1963] discussed numerous association coefficients, not all of which have yet been used. We are not concerned here with recommending what coefficients should be used in different circumstances but merely wish to describe a general coefficient that includes several existing ones as special cases, and can therefore be used under many different circumstances. It is particularly suitable for including in computer programs because it can cope with a variety of different data-types without any reprogramming and also because the positive semi-definite property established in section 3 is a prerequisite for certain types of statistical and numerical analyses (Gower [1966]).

TABLE 1
NUMBERS OF CHARACTERS OCCURRING IN, OR ABSENT FROM, TWO INDIVIDUALS: a (+, +) COMMON TO BOTH INDIVIDUALS; b (-, +) AND c (+, -) OCCURRING IN ONLY ONE INDIVIDUAL; AND d (-, -) ABSENT FROM BOTH

                        Individual 1
                        +        -        Totals
Individual 2    +       a        b        a+b
                -       c        d        c+d
Totals                  a+c      b+d      v

This coefficient has been used since 1960 in various computer programs. To find out how it has behaved the reader is referred to the asterisked references given at the end of this paper.

2. THE DEFINITION OF SIMILARITY

2.1. Terminology

Dichotomous, qualitative, and quantitative variates are distinguished. The term dichotomous is reserved for characters that are either present or absent and whose absence in both of a pair of individuals is not taken as a match; when both levels of a two-level qualitative variate are to be treated on a par, the levels will be termed alternatives. A discussion of some of the considerations governing the choice of scoring the two levels of a response as dichotomous or as alternatives is deferred until section 4. Qualitative characters may have many levels (e.g. black, green, yellow, blue) but unlike the levels of quantitative characters they do not form an ordered set, although for convenience in computing, coded numerical values may be given.

2.2. The calculation of similarity

Two individuals i and j may be compared on a character k and assigned a score $s_{ijk}$, zero when i and j are considered different and a positive fraction, or unity, when they have some degree of agreement or similarity. There are many ways of calculating $s_{ijk}$, some of which are described below. Some-
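
The counts a, b, c and d of Table 1 can be tallied directly from two presence/absence records. The following is a minimal sketch, not code from the paper: the names `AssociationTable` and `association_table` are hypothetical, each individual is assumed to be a sequence of booleans with one entry per character, and missing information is assumed not to occur.

```python
from typing import NamedTuple, Sequence


class AssociationTable(NamedTuple):
    """Cells of the 2 x 2 table, written as (individual 1, individual 2)."""
    a: int  # (+, +) present in both individuals
    b: int  # (-, +) present only in individual 2
    c: int  # (+, -) present only in individual 1
    d: int  # (-, -) absent from both individuals


def association_table(ind1: Sequence[bool], ind2: Sequence[bool]) -> AssociationTable:
    """Tally a, b, c, d over all characters (assumes no missing values)."""
    if len(ind1) != len(ind2):
        raise ValueError("both individuals must be scored on the same characters")
    a = b = c = d = 0
    for present1, present2 in zip(ind1, ind2):
        if present1 and present2:
            a += 1
        elif present2:
            b += 1
        elif present1:
            c += 1
        else:
            d += 1
    return AssociationTable(a, b, c, d)


# Illustrative data only: five characters scored on two individuals.
ind1 = [True, True, False, False, True]
ind2 = [True, False, True, False, True]
table = association_table(ind1, ind2)
print(table)       # AssociationTable(a=2, b=1, c=1, d=1)
print(sum(table))  # 5, the total number of characters v
```

The four cells always sum to v, the number of characters compared, which matches the grand total in Table 1.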
Document title: 1971 A General Coefficient of Similarity and Some
Link address: https://www.777doc.com/doc-3256101 .html