Pattern Matching

With WebCorp you can search for a word, phrase or pattern. Patterns are formed using wildcards (*) and groups of alternative characters or words, which are enclosed in square brackets and separated by the pipe (|) character.

- Wildcards
You can use the wildcard at the end of a string (e.g. run* will match running, runners, etc.) or to stand for a whole word in a phrase (e.g. the * sank will match the boat sank, the ship sank, the ferry sank, etc.).

Multiple wildcards can be used within the same phrase and wildcards can be used in adjacent positions (e.g. the * * sank will match the 'unsinkable' ship sank, the ship had sank, etc.). Note that wildcards at either the first or last position in a phrase are unnecessary.

You can use the wildcard at the beginning of a string within a phrase, e.g. the *ing man will match the running man, the laughing man, etc. However, limited search engine support means that this use of wildcards is experimental at present and you may have more success if you specify a set of words, as described below.
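
The wildcard behaviour described above can be approximated with regular expressions. The following Python sketch is an illustration only and is not WebCorp's own implementation; the function name wildcard_to_regex and the choice to treat a word as any run of non-space characters are assumptions made for this example.

```python
import re


def wildcard_to_regex(pattern: str) -> "re.Pattern[str]":
    """Approximate a wildcard pattern with a regex (hypothetical helper)."""
    token_regexes = []
    for token in pattern.split():
        if token == "*":
            # A lone wildcard stands for exactly one whole word.
            token_regexes.append(r"\S+")
        else:
            # Inside a word, each * stands for zero or more characters.
            parts = [re.escape(part) for part in token.split("*")]
            token_regexes.append(r"\S*".join(parts))
    # Words in a phrase are separated by whitespace.
    return re.compile(r"\s+".join(token_regexes), re.IGNORECASE)


# Usage: fullmatch returns a match object on success and None otherwise.
print(bool(wildcard_to_regex("run*").fullmatch("running")))
print(bool(wildcard_to_regex("the * sank").fullmatch("the boat sank")))
print(bool(wildcard_to_regex("the *ing man").fullmatch("the laughing man")))
```

Unlike the search-engine-backed matching described above, this sketch handles a wildcard at the beginning of a string as reliably as one at the end, because it works on text that is already available locally.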

- Groups of characters or words
While the * sank will match any three-word phrase beginning with the and ending with sank, the pattern the [ship|boat] sank will only match the ship sank or the boat sank.

Brackets can also be used to specify alternative characters within a word, e.g. the [ship|boat] s[a|u]nk will match the ship sank, the ship sunk, the boat sank or the boat sunk.

A more complex example is r[u|a]n[ning|s|], which matches running, runs, run, ranning, rans and ran (not all of these are valid English words, although rans occurs in also rans). The 'empty' position at the end of the final brackets, n[ning|s|], allows the word to end in n+ning, n+s or simply n.
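
To see exactly which strings a bracket pattern stands for, the alternatives can be enumerated. The Python sketch below is a minimal illustration under the assumption that brackets are not nested; the function name expand_groups is hypothetical and the code is not taken from WebCorp itself.

```python
import itertools
import re
from typing import List


def expand_groups(pattern: str) -> List[str]:
    """List every string a bracket pattern can stand for (hypothetical helper)."""
    # Splitting on a capturing group keeps the bracket contents:
    # even indices hold literal text, odd indices hold "a|b|c" alternatives.
    pieces = re.split(r"\[([^\[\]]*)\]", pattern)
    choices = [
        piece.split("|") if index % 2 else [piece]
        for index, piece in enumerate(pieces)
    ]
    # Take one alternative from each position, in every combination.
    return ["".join(combo) for combo in itertools.product(*choices)]


print(expand_groups("the [ship|boat] s[a|u]nk"))
# ['the ship sank', 'the ship sunk', 'the boat sank', 'the boat sunk']
print(expand_groups("r[u|a]n[ning|s|]"))
# ['running', 'runs', 'run', 'ranning', 'rans', 'ran']
```

An empty alternative, as in n[ning|s|], simply contributes an empty string, which is why run and ran appear in the second result.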

Note: Pattern Matching can be used from either the Basic or Advanced Interface.
