EssayTools - Maple Programming Help

EssayTools

 DiceCoefficient
 computes the dice coefficient of two arrays
 BinaryDiceCoefficient
 computes the binary dice coefficient of two arrays

Calling Sequence

 DiceCoefficient( v1, v2 )
 BinaryDiceCoefficient( v1, v2 )

Parameters

 v1, v2 - vector or list of integers

Description

 • The Dice coefficient is a measure of similarity between two vectors. It is twice their dot product divided by the sum of their squared norms:

$\frac{2\left(\mathrm{v1}\cdot \mathrm{v2}\right)}{{\left|\mathrm{v1}\right|}^{2}+{\left|\mathrm{v2}\right|}^{2}}$

 • In binary form, where the vectors contain only 1s and 0s, the formula can be expressed in set notation by treating each vector as the set of positions that hold a 1:

$\frac{2\left|\mathrm{v1}\cap \mathrm{v2}\right|}{\left|\mathrm{v1}\right|+\left|\mathrm{v2}\right|}$

 • v1 and v2 must be the same size.
 • In the context of text comparison, v1 and v2 could hold counts of the occurrences of certain words within two essay sets. In the binary form, v1 and v2 would contain 1 for the presence of a word and 0 for its absence.
 • For non-negative integer counts, the DiceCoefficient and BinaryDiceCoefficient results range from 0 to 1, where 1 is a perfect match and 0 indicates no overlap. Between these extremes, a higher score means the vectors are more similar.
 • BinaryDiceCoefficient accepts any vector as input and interprets every non-zero entry as 1.
 • These functions are part of the EssayTools package, so they can be used in the short forms DiceCoefficient(..) and BinaryDiceCoefficient(..) only after executing the command with(EssayTools). However, they can always be accessed through the long forms EssayTools[DiceCoefficient](..) and EssayTools[BinaryDiceCoefficient](..).
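
 The two formulas in this section can be sketched outside Maple to make the arithmetic explicit. The following Python sketch is illustrative only and is not the EssayTools implementation. The function names dice_coefficient and binary_dice_coefficient are hypothetical, and the handling of two all-zero vectors is an assumption of this sketch, not documented EssayTools behavior.

```python
from typing import Sequence


def dice_coefficient(v1: Sequence[float], v2: Sequence[float]) -> float:
    """Return 2 (v1 . v2) / (|v1|^2 + |v2|^2) for two equal-length vectors."""
    if len(v1) != len(v2):
        raise ValueError("v1 and v2 must be the same size")
    dot = sum(a * b for a, b in zip(v1, v2))
    norm_sq = sum(a * a for a in v1) + sum(b * b for b in v2)
    if norm_sq == 0:
        # Both vectors are all zeros, so the formula is 0/0. Raising here is
        # a choice of this sketch, not documented EssayTools behavior.
        raise ValueError("Dice coefficient is undefined for two zero vectors")
    return 2 * dot / norm_sq


def binary_dice_coefficient(v1: Sequence[float], v2: Sequence[float]) -> float:
    """Treat every non-zero entry as 1, then apply the same formula.

    For 0/1 vectors the dot product counts the shared positions and each
    squared norm counts the non-zero entries, which gives 2|A n B| / (|A| + |B|).
    """
    b1 = [1 if a != 0 else 0 for a in v1]
    b2 = [1 if b != 0 else 0 for b in v2]
    return dice_coefficient(b1, b2)


if __name__ == "__main__":
    print(dice_coefficient([1, 0, 4], [0, 1, 1]))         # 8/19
    print(binary_dice_coefficient([1, 0, 4], [0, 1, 1]))  # 2/4
```

 Routing the binary form through the general formula works because squaring a 0 or 1 leaves it unchanged, so the set-notation expression is a special case of the vector one.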

Examples

 > $\mathrm{with}\left(\mathrm{EssayTools}\right)$
 $\left[{\mathrm{AppendToWordList}}{,}{\mathrm{BinaryCosineCoefficient}}{,}{\mathrm{BinaryDiceCoefficient}}{,}{\mathrm{BinaryJaccardCoefficient}}{,}{\mathrm{BuildScoreModel}}{,}{\mathrm{CosineCoefficient}}{,}{\mathrm{CountMisspellings}}{,}{\mathrm{CountUseOfAllWords}}{,}{\mathrm{CountUseOfEachWord}}{,}{\mathrm{DetectPlagiarism}}{,}{\mathrm{DiceCoefficient}}{,}{\mathrm{GetWordList}}{,}{\mathrm{GetWordTable}}{,}{\mathrm{IsAdjective}}{,}{\mathrm{IsAdverb}}{,}{\mathrm{IsConjunction}}{,}{\mathrm{IsDefiniteArticle}}{,}{\mathrm{IsIndefiniteArticle}}{,}{\mathrm{IsInterjection}}{,}{\mathrm{IsIntransitiveVerb}}{,}{\mathrm{IsNominative}}{,}{\mathrm{IsNoun}}{,}{\mathrm{IsNounPhrase}}{,}{\mathrm{IsPlural}}{,}{\mathrm{IsPreposition}}{,}{\mathrm{IsPronoun}}{,}{\mathrm{IsTransitiveVerb}}{,}{\mathrm{IsUsuallyParticipleVerb}}{,}{\mathrm{IsVerb}}{,}{\mathrm{JaccardCoefficient}}{,}{\mathrm{Lemma}}{,}{\mathrm{Misspellings}}{,}{\mathrm{PartOfSpeech}}{,}{\mathrm{QuadraticWeightedKappa}}{,}{\mathrm{Reduce}}{,}{\mathrm{Score}}{,}{\mathrm{SetWordList}}{,}{\mathrm{SimilarityScore}}{,}{\mathrm{SpellCorrectWord}}{,}{\mathrm{WordUse}}\right]$ (1)
 > $\mathrm{DiceCoefficient}\left(\left[1,2,3\right],\left[1,2,3\right]\right)$
 ${1.}$ (2)
 > $\mathrm{BinaryDiceCoefficient}\left(\left[1,2,3\right],\left[1,2,3\right]\right)$
 ${1.}$ (3)
 > $\mathrm{DiceCoefficient}\left(\left[1,0,1\right],\left[0,1,0\right]\right)$
 ${0.}$ (4)
 > $\mathrm{BinaryDiceCoefficient}\left(\left[1,0,1\right],\left[0,1,0\right]\right)$
 ${0.}$ (5)
 > $\mathrm{DiceCoefficient}\left(\left[1,0,4\right],\left[0,1,1\right]\right)$
 ${0.4210526316}$ (6)
 > $\mathrm{BinaryDiceCoefficient}\left(\left[1,0,4\right],\left[0,1,1\right]\right)$
 ${0.5000000000}$ (7)
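
 The last two results can be checked by hand against the formulas in the Description. For v1 = [1,0,4] and v2 = [0,1,1], the general form gives:

$\frac{2\left(1\cdot 0+0\cdot 1+4\cdot 1\right)}{\left({1}^{2}+{0}^{2}+{4}^{2}\right)+\left({0}^{2}+{1}^{2}+{1}^{2}\right)}=\frac{8}{19}\approx 0.4211$

 In the binary form the vectors are read as [1,0,1] and [0,1,1]. They share one non-zero position and have two non-zero entries each:

$\frac{2\cdot 1}{2+2}=0.5$
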
 > $\mathrm{Asimov}≔"The most exciting phrase to hear in science, the one that heralds new discoveries, is not \text{'}Eureka!\text{'} but \text{'}That\text{'}s funny...\text{'}":$
 > $\mathrm{Heisenberg}≔"It is not surprising that our language should be incapable of describing the processes occurring within the atoms, for, as has been remarked, it was invented to describe the experiences of daily life, and these consist only of processes involving exceedingly large numbers of atoms. Furthermore, it is very difficult to modify our language so that it will be able to describe these atomic processes, for words can only describe things of which we can form mental pictures, and this ability, too, is a result of daily experience. Fortunately, mathematics is not subject to this limitation, and it has been possible to invent a mathematical scheme - the quantum theory - which seems entirely adequate for the treatment of atomic processes; for visualization, however, we must content ourselves with two incomplete analogies - the wave picture and the corpuscular picture.":$
 > $\mathrm{Born}≔"The ultimate origin of the difficulty lies in the fact \left(or philosophical principle\right) that we are compelled to use the words of common language when we wish to describe a phenomenon, not by logical or mathematical analysis, but by a picture appealing to the imagination. Common language has grown by everyday experience and can never surpass these limits. Classical physics has restricted itself to the use of concepts of this kind; by analysing visible motions it has developed two ways of representing them by elementary processes; moving particles and waves. There is no other way of giving a pictorial description of motions -- we have to apply it even in the region of atomic processes, where classical physics breaks down.":$
 > $W≔\mathrm{CountUseOfEachWord}\left(\left[\mathrm{Asimov},\mathrm{Heisenberg},\mathrm{Born}\right],\left["science","language","atomic","describe","phrase","exciting","mathematical"\right]\right)$
 ${W}{:=}\left[\begin{array}{rrrrrrr}{1}& {0}& {0}& {0}& {1}& {1}& {0}\\ {0}& {2}& {2}& {3}& {0}& {0}& {1}\\ {0}& {2}& {1}& {1}& {0}& {0}& {1}\end{array}\right]$ (8)
 > $\mathrm{DiceCoefficient}\left({W}_{1},{W}_{2}\right)$
 ${0.}$ (9)
 > $\mathrm{DiceCoefficient}\left({W}_{2},{W}_{3}\right)$
 ${0.8000000000}$ (10)
 > $\mathrm{BinaryDiceCoefficient}\left({W}_{2},{W}_{3}\right)$
 ${1.}$ (11)
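
 The difference between the last two results follows from the rows of W. Row 2 is [0,2,2,3,0,0,1] and row 3 is [0,2,1,1,0,0,1], so the count-based form gives:

$\frac{2\left(2\cdot 2+2\cdot 1+3\cdot 1+1\cdot 1\right)}{\left({2}^{2}+{2}^{2}+{3}^{2}+{1}^{2}\right)+\left({2}^{2}+{1}^{2}+{1}^{2}+{1}^{2}\right)}=\frac{20}{25}=0.8$

 Both rows are non-zero in the same four columns, so the binary form sees identical word sets:

$\frac{2\cdot 4}{4+4}=1$
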
 > $\mathrm{allwords}≔\mathrm{map}\left(x→\mathrm{op}\left(\mathrm{StringTools}:-\mathrm{Words}\left(x\right)\right),\left\{\mathrm{Asimov},\mathrm{Heisenberg},\mathrm{Born}\right\}\right)$
 ${\mathrm{allwords}}{:=}\left\{{"\text{'}"}{,}{"\text{'}Eureka"}{,}{"\text{'}That\text{'}s"}{,}{"Common"}{,}{"It"}{,}{"The"}{,}{"There"}{,}{"a"}{,}{"ability"}{,}{"able"}{,}{"and"}{,}{"apply"}{,}{"are"}{,}{"as"}{,}{"atomic"}{,}{"atoms"}{,}{"be"}{,}{"been"}{,}{"breaks"}{,}{"but"}{,}{"by"}{,}{"can"}{,}{"common"}{,}{"consist"}{,}{"content"}{,}{"daily"}{,}{"down"}{,}{"even"}{,}{"fact"}{,}{"for"}{,}{"form"}{,}{"funny"}{,}{"giving"}{,}{"grown"}{,}{"has"}{,}{"have"}{,}{"hear"}{,}{"heralds"}{,}{"however"}{,}{"in"}{,}{"invent"}{,}{"is"}{,}{"it"}{,}{"itself"}{,}{"kind"}{,}{"large"}{,}{"lies"}{,}{"life"}{,}{"limits"}{,}{"logical"}{,}{"mental"}{,}{"modify"}{,}{"most"}{,}{"motions"}{,}{"moving"}{,}{"must"}{,}{"never"}{,}{"new"}{,}{"no"}{,}{"not"}{,}{"numbers"}{,}{"of"}{,}{"one"}{,}{"only"}{,}{"or"}{,}{"origin"}{,}{"other"}{,}{"our"}{,}{"phrase"}{,}{"physics"}{,}{"picture"}{,}{"quantum"}{,}{"region"}{,}{"result"}{,}{"scheme"}{,}{"science"}{,}{"seems"}{,}{"should"}{,}{"so"}{,}{"subject"}{,}{"surpass"}{,}{"that"}{,}{"the"}{,}{"them"}{,}{"theory"}{,}{"these"}{,}{"things"}{,}{"this"}{,}{"to"}{,}{"too"}{,}{"two"}{,}{"use"}{,}{"very"}{,}{"visible"}{,}{"was"}{,}{"wave"}{,}{"waves"}{,}{"way"}{,}{"ways"}{,}{"we"}{,}{"when"}{,}{"where"}{,}{"which"}{,}{"will"}{,}{"wish"}{,}{"with"}{,}{"within"}{,}{"words"}{,}{"Classical"}{,}{"Fortunately"}{,}{"Furthermore"}{,}{"adequate"}{,}{"analogies"}{,}{"analysing"}{,}{"analysis"}{,}{"appealing"}{,}{"classical"}{,}{"compelled"}{,}{"concepts"}{,}{"corpuscular"}{,}{"describe"}{,}{"describing"}{,}{"description"}{,}{"developed"}{,}{"difficult"}{,}{"difficulty"}{,}{"discoveries"}{,}{"elementary"}{,}{"entirely"}{,}{"everyday"}{,}{"exceedingly"}{,}{"exciting"}{,}{"experience"}{,}{"experiences"}{,}{"imagination"}{,}{"incapable"}{,}{"incomplete"}{,}{"invented"}{,}{"involving"}{,}{"language"}{,}{"limitation"}{,}{"mathematical"}{,}{"mathematics"}{,}{"occurring"}{,}{"ourselves"}{,}{"particles"}{,}{"phenomenon"}{,}{"philosophical"}{,}{"pictorial"}{,}{"pictures"}{,}{
"possible"}{,}{"principle"}{,}{"processes"}{,}{"remarked"}{,}{"representing"}{,}{"restricted"}{,}{"surprising"}{,}{"treatment"}{,}{"ultimate"}{,}{"visualization"}\right\}$ (12)
 > $W≔\mathrm{CountUseOfEachWord}\left(\left[\mathrm{Asimov},\mathrm{Heisenberg},\mathrm{Born}\right],\mathrm{allwords}\right)$
 ${W}{:=}\left[\begin{array}{c}{\text{1..3 x 1..160 Array}}\\ {\text{Data Type: anything}}\\ {\text{Storage: rectangular}}\\ {\text{Order: Fortran\_order}}\end{array}\right]$ (13)
 > $\mathrm{DiceCoefficient}\left({W}_{1},{W}_{2}\right)$
 ${0.1447721180}$ (14)
 > $\mathrm{DiceCoefficient}\left({W}_{1},{W}_{3}\right)$
 ${0.1562500000}$ (15)
 > $\mathrm{DiceCoefficient}\left({W}_{2},{W}_{3}\right)$
 ${0.6294573643}$ (16)

Compatibility

 • The EssayTools[DiceCoefficient] and EssayTools[BinaryDiceCoefficient] commands were introduced in Maple 17.