Compute the entropy of a string
Entropy( s )
The Entropy(s) command returns the Shannon entropy of the string s as a floating-point number.
Shannon's entropy is defined as -add( P( ch ) * log[ 2 ]( P( ch ) ), ch = Support( s ) ), where P( ch ) = CountCharacterOccurrences( s, ch ) / length( s ). It is a measure of the information content of the string, and can be interpreted as the number of bits required to encode each character of the string given perfect compression. The entropy is maximal when each character is equally likely. For arbitrary non-null characters, this maximal value is log[ 2 ]( 255 ) = 7.99435 (to six significant digits).
(The null byte, with code point 0, cannot appear in a Maple string. If all 256 single-byte code points could appear, then the maximal entropy would be log[ 2 ]( 256 ) = 8, which is the number of bits per byte.)
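As a cross-check outside Maple, the definition and the maximal value quoted above can be reproduced with a minimal Python sketch. The function name shannon_entropy and the zero-entropy convention for empty input are assumptions made for illustration. They are not part of StringTools.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte: -sum(P(b) * log2(P(b)))."""
    if not data:
        return 0.0  # assumption: empty input is given zero entropy
    n = len(data)
    counts = Counter(data)  # occurrences of each distinct byte value
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Every non-null byte value exactly once: the equally likely case over 255 symbols.
uniform_255 = bytes(range(1, 256))
print(round(shannon_entropy(uniform_255), 5))  # 7.99435

# Two symbols, each with probability 1/2: one bit per character.
print(shannon_entropy(b"abab"))  # 1.0
```

The first result agrees with log[ 2 ]( 255 ), because a string in which all 255 permitted byte values occur equally often attains the maximum.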
Note that the entropy is computed as a floating-point number, at hardware (double) precision.
All of the StringTools package commands treat strings as (null-terminated) sequences of 8-bit (ASCII) characters. Thus, there is no support for multibyte character encodings, such as Unicode encodings.
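To see why a byte-oriented view matters for multibyte text, the following Python sketch compares entropy over characters with entropy over encoded bytes. It assumes the text is encoded as UTF-8. The helper name entropy is hypothetical, and the sketch does not model how Maple itself stores or processes such input.

```python
import math
from collections import Counter


def entropy(symbols) -> float:
    """Shannon entropy in bits per symbol for any sequence of hashable symbols."""
    n = len(symbols)
    if n == 0:
        return 0.0  # assumption: empty input is given zero entropy
    return -sum((c / n) * math.log2(c / n) for c in Counter(symbols).values())


text = "é" * 4              # four identical characters
raw = text.encode("utf-8")  # assumption: UTF-8, which uses two bytes for "é"

print(len(text), len(raw))  # 4 8
print(entropy(text) == 0)   # True: one distinct character carries no information
print(entropy(raw))         # 1.0: two distinct byte values, equally frequent
```

A tool that counts 8-bit units sees the second picture, so its result for multibyte text describes the encoded bytes and not the characters.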
with( StringTools ):
Entropy( Iota( 1, 255 ) );
The following steps illustrate the definition of Entropy.
s := Random( 30, 'lower' );
occ := [ seq( CountCharacterOccurrences( s, ch ), ch = Support( s ) ) ];
L := map( `/`, occ, length( s ) );
U := map( p -> -evalf( p * log[ 2 ]( p ) ), L );
add( u, u = U );
Entropy( s );