construct the dictionary
search a string for occurrences of patterns in the dictionary
return the number of patterns in the dictionary
retrieve a pattern from the dictionary
factor a string over the dictionary
Create( patlist )
Search( dict, text )
Get( dict, id )
Size( dict )
Factors( s, dict )
type( expr, dictionary )
dict - dictionary; dictionary object returned by Create
text - string; text to be searched
patlist - list of strings or the name builtin; patterns to search for
s - string; string to be factored
id - positive integer; index
expr - anything; Maple expression
The PatternDictionary subpackage of the StringTools package provides persistent multiple-pattern matching. It maintains a search data structure for multiple patterns (called the dictionary), against which text strings may be searched. This covers the case that StringTools[Search] does not handle: searching for multiple patterns in multiple texts. The PatternDictionary subpackage provides facilities for managing this list of patterns.
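As a minimal sketch of the "multiple patterns, multiple texts" use case, the following builds one dictionary and reuses it for several searches. The short name PD and the sample texts are illustrative choices, not part of the package.

```maple
# PD is a hypothetical local alias for the subpackage, used to avoid name clashes.
PD := StringTools:-PatternDictionary:

# Build the dictionary once ...
dict := PD:-Create(["fee", "foo", "foe"]):

# ... then search any number of texts against it.
texts := ["afoot", "defoe", "coffee"]:
for t in texts do
    # Wrap the result in a list so that an empty result still prints cleanly.
    print(t, [PD:-Search(dict, t)]);
end do;
```

The dictionary is built only once, so the cost of constructing the search structure is shared across all of the texts.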
Five procedures are provided for managing and using the search patterns.
The Create(patlist) command constructs the dictionary. The single argument patlist must be a list of strings, or the special name builtin. When invoked with the argument builtin, a small list of English and Maple words is used. This is intended primarily for system use, but is also available to users.
Note: Sets of patterns are not allowed as arguments to Create. This is because each pattern is associated with a unique ID that identifies it by its position in the pattern list. Thus, pattern lists are inherently ordered.
When a dictionary is created, an external search data structure (the search automaton) is created and used for subsequent searches using Search. The storage used for the search automaton is released when the dictionary object with which it is associated is garbage collected, when a restart command is issued, or when the Maple process is terminated.
The Search(dict, text) command searches the string text for occurrences of patterns in the dictionary dict. It returns an expression sequence of pairs of the form offset,id, where offset is a positive integer giving the offset into text of the match it represents, and id is the index of the matching pattern in the current pattern list.
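The following sketch shows one way to turn the search results back into readable matches. It assumes that each pair in the returned sequence is a two-element list [offset, id]; that assumption is not confirmed by the text above, and the alias PD is illustrative.

```maple
# PD is a hypothetical local alias for the subpackage.
PD := StringTools:-PatternDictionary:
dict := PD:-Create(["fee", "foo", "foe"]):

matches := [PD:-Search(dict, "defoe")]:
for m in matches do
    # Assumption: m is a list [offset, id]; adjust if the pairs have another shape.
    printf("%s at offset %d\n", PD:-Get(dict, m[2]), m[1]);
end do;
```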
The Size(dict) command returns the number of patterns in the dictionary dict as a Maple integer.
Each pattern in a dictionary dict is associated with a positive integer ID that identifies it uniquely within that dictionary. The pattern whose ID is id can be retrieved from the dictionary by using the Get(dict, id) command.
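Because IDs correspond to positions in the pattern list, Size and Get together can be used to list the contents of a dictionary. This is a minimal sketch; the alias PD and the loop variable are illustrative.

```maple
# PD is a hypothetical local alias for the subpackage.
PD := StringTools:-PatternDictionary:
dict := PD:-Create(["fee", "foo", "foe"]):

# IDs are assumed to run from 1 to Size(dict), following the list order.
for i from 1 to PD:-Size(dict) do
    printf("%d: %s\n", i, PD:-Get(dict, i));
end do;
```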
A string s can be factored over a dictionary dict by using the Factors(s, dict) command. It determines a sequence of strings in the dictionary that, when concatenated, form the input string s. Note that multiple factorizations are possible.
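A short sketch of factoring follows. The strings are illustrative. The error handling assumes that Factors raises an error when no factorization exists, as the "no factorization" message in the examples on this page suggests.

```maple
# PD is a hypothetical local alias for the subpackage.
PD := StringTools:-PatternDictionary:
dict := PD:-Create(["fee", "foo", "foe"]):

# "foofee" can be written as "foo" followed by "fee".
PD:-Factors("foofee", dict);

# "foobar" cannot be built from the patterns, so guard the call.
try
    PD:-Factors("foobar", dict);
catch:
    printf("foobar cannot be factored over this dictionary\n");
end try;
```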
All of the StringTools package commands treat strings as (null-terminated) sequences of 8-bit (ASCII) characters. Thus, there is no support for multibyte character encodings, such as Unicode encodings.
bid := Create('builtin'):
dict := Create(["fee", "foo"]):
s := Search(dict, "afoot")
dict := Create(["fee", "foo", "foe"]):
r := Search(dict, "defoe")
Error, (in aux) no factorization
You can replace the small built-in dictionary with a better word list.
WordList := "/usr/share/dict/words":
WordList := FileTools:-JoinPath([kernelopts(':-datadir'), "help", "StringTools", "words.dat"]):
dict := Create(remove(type, Split(readbytes(WordList, 'TEXT', infinity)), "")):
Of course, many applications will not use English words.
dict := Create([seq(ThueMorse(i), i = 10 .. 1000)]):
dict := Create([seq(Fibonacci(i, "a", "b"), i = 2 .. 10)])
dict := module() option unload = free, dictionary; local did, stringlist, free; export get, size, search; description "a dictionary object"; end module