A Corpus of 21st Century Scots Texts

Levenshtein Distance

Enter a word to find nearest neighbouring words, for example sonsie

Similar words to pilgrimage in Corpus

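Results are listed by method, in this order: Levenshtein, Double Levenshtein, SoundEx, MetaPhone, Manually curated. Bracketed numbers are distances; the first two lists each start at pilgrimage (0), and the final entry, PILGRIMAGE, is the manually curated result.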
pilgrimage (0) - 12 freq
pilgrim's (3) - 7 freq
pilgrim (3) - 14 freq
pilgrims (3) - 15 freq
pilgrimers (3) - 2 freq
pilgrimer (3) - 4 freq
self-image (4) - 2 freq
grimace (4) - 3 freq
pilrims (4) - 1 freq
pilgremer (4) - 1 freq
pillage (4) - 4 freq
gilravage (4) - 1 freq
pilgim (4) - 1 freq
plumage (4) - 2 freq
pilie (5) - 1 freq
intricate (5) - 13 freq
jigtime (5) - 1 freq
patronage (5) - 3 freq
inanimate (5) - 2 freq
pantridge (5) - 1 freq
eildrig (5) - 2 freq
williame (5) - 1 freq
vilage (5) - 1 freq
pierian (5) - 1 freq
pillowcase (5) - 1 freq
pilgrimage (0) - 12 freq
pilgrims (4) - 15 freq
pilgrimer (4) - 4 freq
pilgrim (4) - 14 freq
pilgrim's (5) - 7 freq
pilgrimers (5) - 2 freq
pilgremer (5) - 1 freq
pilrims (6) - 1 freq
plumage (6) - 2 freq
pilgim (6) - 1 freq
programme (7) - 102 freq
pilaraymara (7) - 2 freq
self-image (7) - 2 freq
grimace (7) - 3 freq
gilravage (7) - 1 freq
pillage (7) - 4 freq
gairage (8) - 6 freq
piling (8) - 1 freq
belgrade (8) - 1 freq
grime (8) - 5 freq
primsie (8) - 1 freq
primal (8) - 2 freq
program (8) - 24 freq
playmate (8) - 1 freq
sell-eimage (8) - 1 freq
SoundEx code - P426
playgrund - 29 freq
pilgrim's - 7 freq
pleasure - 71 freq
pleisures - 4 freq
plaisure - 2 freq
pleesure - 23 freq
plooshares - 1 freq
pilgrimage - 12 freq
pleasures - 11 freq
pleisur - 32 freq
playgroup - 4 freq
pleisure - 60 freq
pleyscrievin - 2 freq
playgrun - 17 freq
playgruns - 1 freq
pilgrims - 15 freq
pilgrim - 14 freq
pleisurit - 1 freq
pleesher - 2 freq
pleisour - 1 freq
playgroond - 6 freq
play-gruns - 1 freq
playground - 6 freq
polygraph - 1 freq
pleesuir - 2 freq
pleesuirs - 2 freq
pleesures - 3 freq
pliesjir - 1 freq
plagiarist - 1 freq
plaisir - 3 freq
plaesur - 2 freq
ploushare - 2 freq
plooshare - 2 freq
pleasour - 1 freq
pilgrimer - 4 freq
pluscarden - 1 freq
pleisurable - 1 freq
pilgrimers - 2 freq
pleisir - 3 freq
plagiarisin - 1 freq
playgrunn - 2 freq
pleisur-snowker - 1 freq
plagiarism - 2 freq
plei-sured - 1 freq
pleesurin - 1 freq
pilgremer - 1 freq
pleygroup - 1 freq
pleisured - 1 freq
pleisurs - 1 freq
pleygroups - 1 freq
policework - 1 freq
placards - 2 freq
placard - 2 freq
pleygrund - 1 freq
pleygrun - 1 freq
playgroun - 2 freq
pylqzqr - 1 freq
pauljcorrigan - 1 freq
pilchard - 1 freq
polisher - 1 freq
paulgardinerdj - 1 freq
plucker - 1 freq
MetaPhone code - PLKRMJ
pilgrimage - 12 freq
PILGRIMAGE
Time to execute Levenshtein function - 0.209830 milliseconds
The Levenshtein distance is the minimum number of single-character replacements, insertions or deletions needed to transform one word into another. It's useful for detecting typos and alternative spellings.
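As a rough illustration of the definition above, here is a minimal Python sketch of the standard dynamic-programming calculation. It is not the code this site runs; the function name `levenshtein` is chosen for the example only.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions
    or replacements needed to turn a into b."""
    # Keep the shorter word in the inner loop so the row stays small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] = distance between the empty prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance between a[:i] and the empty string
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # replace (or keep) the character
            ))
        previous = current
    return previous[-1]


print(levenshtein("pilgrimage", "pilgrim"))  # 3
print(levenshtein("pilgrimage", "pillage"))  # 4
```

Both values agree with the bracketed distances shown for pilgrim and pillage in the Levenshtein results.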
Time to execute Double Levenshtein function - 0.404110 milliseconds
In a stroke of genius, this runs the Levenshtein function twice, once on the full words and once with the vowels removed, and adds the two distances together, giving double weight to consonants.
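The idea can be sketched in Python as follows. This is an assumption-laden illustration, not the site's own code: the names `strip_vowels` and `double_levenshtein` are invented here, and the vowel set is assumed to be a, e, i, o, u (with y and punctuation kept), which happens to reproduce the distances listed for this query.

```python
VOWELS = set("aeiou")  # assumption: y and apostrophes are kept


def levenshtein(a: str, b: str) -> int:
    """Plain edit distance, computed row by row."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


def strip_vowels(word: str) -> str:
    """Drop vowels so only the consonant skeleton remains."""
    return "".join(ch for ch in word if ch.lower() not in VOWELS)


def double_levenshtein(a: str, b: str) -> int:
    """Distance on the full words plus distance on the vowel-less words."""
    return levenshtein(a, b) + levenshtein(strip_vowels(a), strip_vowels(b))


print(double_levenshtein("pilgrimage", "pilgrim"))    # 4
print(double_levenshtein("pilgrimage", "programme"))  # 7
```

Because the consonant skeleton is compared a second time, a word such as programme (same skeleton shape, different vowels) ranks closer than its plain Levenshtein distance alone would suggest.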
Time to execute SoundEx function - 0.029255 milliseconds
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.
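To show how such a code is built, here is a minimal Python sketch of the common American Soundex rules: keep the first letter, map the following consonants to digits, collapse repeats, and pad to four characters. It is an illustrative assumption rather than the function this site calls, and it ignores any characters outside a to z.

```python
# Consonant groups and their Soundex digits.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for letter in letters:
        CODES[letter] = digit


def soundex(word: str) -> str:
    """Return a four-character Soundex code, e.g. 'P426'."""
    letters = [ch for ch in word.lower() if "a" <= ch <= "z"]
    if not letters:
        return ""
    result = letters[0].upper()
    last_code = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not break up a run of equal codes
        code = CODES.get(ch, "")  # vowels and y have no code
        if code and code != last_code:
            result += code
        last_code = code  # a vowel resets the run, so repeats count again
        if len(result) == 4:
            break
    return result.ljust(4, "0")  # pad short codes with zeros


print(soundex("pilgrimage"))  # P426
print(soundex("playgrund"))   # P426
```

This also shows why unrelated words such as pleasure and pilchard appear in the SoundEx list: only the first letter and the first three coded consonants are compared.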
Time to execute MetaPhone function - 0.041386 milliseconds
Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation.[1] It fundamentally improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names which sound similar.
Time to execute Manually curated function - 0.000905 milliseconds
Manual curation uses a hand-built lookup table (lexicon) that links words to their lemmas and includes obvious typos and spelling variations. Not all words are covered.
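A lookup of this kind can be sketched in Python as a plain dictionary. The table below is a hypothetical stand-in: only the pilgrimage to PILGRIMAGE pairing comes from the results on this page, the second entry is invented to illustrate a typo mapping, and the names `LEXICON` and `lookup_lemma` are assumptions.

```python
from typing import Optional

# Hypothetical hand-made lexicon: surface form -> lemma.
LEXICON = {
    "pilgrimage": "PILGRIMAGE",   # pairing shown in the results on this page
    "pilgrimmage": "PILGRIMAGE",  # invented example of a typo entry
}


def lookup_lemma(word: str) -> Optional[str]:
    """Return the curated lemma, or None when the word is not covered."""
    return LEXICON.get(word.lower())


print(lookup_lemma("Pilgrimage"))  # PILGRIMAGE
print(lookup_lemma("sonsie"))      # None
```

A dictionary lookup does no string comparison across the corpus, which fits with this method having the shortest execution time of the five.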