A Corpus of 21st Century Scots Texts

Levenshtein Distance

Enter a word to find nearest neighbouring words, for example ablow

Similar words to caution in Corpus

Methods: Levenshtein, Double Levenshtein, SoundEx, MetaPhone, Manually curated
Levenshtein - distance in brackets
caution (0) - 9 freq
cautions (1) - 1 freq
caption (1) - 1 freq
caumin (2) - 2 freq
calton (2) - 10 freq
caxton (2) - 1 freq
castin (2) - 48 freq
mautioun (2) - 1 freq
captioun (2) - 1 freq
clautin (2) - 1 freq
paction (2) - 18 freq
ration (2) - 4 freq
captin (2) - 2 freq
creaution (2) - 2 freq
causin (2) - 14 freq
sautin (2) - 1 freq
cauin (2) - 1 freq
bastion (2) - 2 freq
caudron (2) - 1 freq
captions (2) - 1 freq
dautin (2) - 2 freq
cartoon (2) - 13 freq
canton (2) - 2 freq
faction (2) - 3 freq
carton (2) - 7 freq
Double Levenshtein - combined distance in brackets
caution (0) - 9 freq
auction (2) - 3 freq
action (2) - 138 freq
cautions (2) - 1 freq
caption (2) - 1 freq
cauvin (3) - 1 freq
cautious (3) - 8 freq
pautin (3) - 1 freq
faction (3) - 3 freq
carton (3) - 7 freq
cautioner (3) - 1 freq
cantin (3) - 4 freq
naition (3) - 1 freq
coatin (3) - 1 freq
citin (3) - 1 freq
actin (3) - 58 freq
ection (3) - 16 freq
actioun (3) - 15 freq
canton (3) - 2 freq
cartin (3) - 1 freq
caukin (3) - 1 freq
nation (3) - 143 freq
clautin (3) - 1 freq
paction (3) - 18 freq
ration (3) - 4 freq
SoundEx code - C350
cuttin - 73 freq
cut-doon - 2 freq
cuidnae - 135 freq
caution - 9 freq
coudna - 47 freq
cotton - 25 freq
cuidna - 94 freq
cudnae - 144 freq
cudna - 165 freq
chattin - 22 freq
cheatin - 4 freq
cotton-woo - 2 freq
cidna - 2 freq
cidnae - 2 freq
cwidna - 47 freq
chidin - 1 freq
cuttin' - 1 freq
coodna - 71 freq
coddin - 2 freq
chaitin - 3 freq
cud'nae - 1 freq
'cudna - 1 freq
coudnae - 19 freq
cuttan - 4 freq
chaetin - 1 freq
cheatan - 1 freq
coudno - 3 freq
cadona - 6 freq
coodnae - 52 freq
cuddie-an - 1 freq
chatham - 1 freq
cydonia - 2 freq
cweedna - 2 freq
cidni - 1 freq
cottown - 1 freq
chattan - 1 freq
“cudna - 1 freq
coatin - 1 freq
citin - 1 freq
chutney - 4 freq
cowden - 7 freq
codeine - 1 freq
ctyem - 1 freq
cudnea - 2 freq
MetaPhone code - KXN
kitchen - 420 freq
catchin - 52 freq
caution - 9 freq
cushion - 28 freq
cushin - 3 freq
gushin - 4 freq
kychin - 1 freq
catchen - 2 freq
cooshen - 1 freq
catchin' - 2 freq
cashin - 1 freq
keetcheen - 2 freq
catchan - 12 freq
ketchin - 1 freq
keetchen - 9 freq
cöshin - 2 freq
kitcheen - 6 freq
keetchin - 3 freq
kitchin - 2 freq
keichin - 1 freq
kaatchin - 2 freq
kooshin - 1 freq
keechin - 5 freq
gowchin - 1 freq
keichen - 3 freq
coushin - 1 freq
gauchin - 1 freq
kitchn - 1 freq
coachin - 1 freq
'kitchen' - 1 freq
cooshion - 1 freq
cooshin - 1 freq
Manually curated lemma - CAUTION
Time to execute Levenshtein function - 0.185787 milliseconds
The Levenshtein distance is the minimum number of single-character replacements, insertions or deletions needed to transform one word into another. It is useful for detecting typos and alternative spellings.
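As an illustration of the definition above, here is a minimal Python sketch of the standard dynamic-programming calculation. It is not the code used by this site; the function name `levenshtein` is chosen here for illustration only.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter word in the inner loop
    # previous[j] holds the distance between the processed prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # replace ca with cb (or keep it)
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    for word in ("caution", "cautions", "caption", "action"):
        print(word, levenshtein("caution", word))
```

Run against words from the list above, this gives 1 for cautions and caption and 2 for action, matching the bracketed values shown.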
Time to execute Double Levenshtein function - 0.349309 milliseconds
In a stroke of genius, this runs the Levenshtein function twice, once on the full words and once with the vowels removed, and adds the two distances together, giving double weight to consonants.
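A rough Python sketch of that idea follows. It assumes the vowels removed are a, e, i, o and u, which the page does not state, and the names `strip_vowels` and `double_levenshtein` are hypothetical. The site's own implementation may differ.

```python
VOWELS = set("aeiou")  # assumption: the page does not say which letters count as vowels


def levenshtein(a: str, b: str) -> int:
    """Plain Levenshtein distance, computed row by row."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


def strip_vowels(word: str) -> str:
    """Drop the assumed vowel letters, leaving the consonant skeleton."""
    return "".join(ch for ch in word if ch.lower() not in VOWELS)


def double_levenshtein(a: str, b: str) -> int:
    """Full-word distance plus the distance between the consonant skeletons."""
    return levenshtein(a, b) + levenshtein(strip_vowels(a), strip_vowels(b))


if __name__ == "__main__":
    for word in ("cautions", "auction", "cautious", "nation"):
        print(word, double_levenshtein("caution", word))
```

Under that vowel assumption, auction scores 2 (its consonant skeleton, ctn, is the same as that of caution) while cautious and nation score 3, in line with the Double Levenshtein list above.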
Time to execute SoundEx function - 0.027197 milliseconds
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.
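To show how words with different spellings can end up with the same code, here is a small Python sketch of the commonly described American Soundex rules: keep the first letter, map the remaining consonants to digits, skip h and w, collapse adjacent repeats, and pad to four characters. The function name `soundex` and the choice to ignore non-ASCII letters and punctuation are assumptions made for this sketch, not details taken from the site.

```python
# Digit assigned to each consonant; vowels and y have no digit.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for letter in letters:
        CODES[letter] = digit


def soundex(word: str) -> str:
    """Return a four-character Soundex code, or '' if the word has no letters."""
    # Assumption: punctuation and non-ASCII characters are simply ignored.
    letters = [ch for ch in word.lower() if ch.isascii() and ch.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    last = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate consonants with the same digit
        code = CODES.get(ch, "")
        if code and code != last:
            result += code
            if len(result) == 4:
                break
        last = code  # a vowel resets 'last', so a repeated digit is allowed after it
    return result.ljust(4, "0")


if __name__ == "__main__":
    for word in ("caution", "cuttin", "cudnae", "chatham", "cotton-woo"):
        print(word, soundex(word))
```

Each of the five sample words, all taken from the list above, comes out as C350.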
Time to execute MetaPhone function - 0.036465 milliseconds
Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation.[1] It fundamentally improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names which sound similar.
Time to execute Manually curated function - 0.000839 milliseconds
Manual curation uses a hand-built lookup table (lexicon) that links words to their lemmas, including obvious typos and spelling variations. Not all words are covered.
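A lookup of this kind can be sketched in Python as a plain dictionary. The table contents and the names `LEMMAS` and `lookup_lemma` are hypothetical: only the pairing of caution with CAUTION is taken from this page, and the other entry is made up for illustration.

```python
from typing import Optional

# Hypothetical hand-built lexicon mapping a spelling to its lemma.
# Only "caution" -> "CAUTION" comes from the page; "cautions" is illustrative.
LEMMAS = {
    "caution": "CAUTION",
    "cautions": "CAUTION",
}


def lookup_lemma(word: str) -> Optional[str]:
    """Return the curated lemma, or None when the word is not in the lexicon."""
    return LEMMAS.get(word.strip().lower())


if __name__ == "__main__":
    print(lookup_lemma("caution"))  # CAUTION
    print(lookup_lemma("ablow"))    # None, since this sketch has no entry for it
```

Because the lookup is a single dictionary access with no string comparison across the corpus, it is plausible that this is why the manually curated method shows the shortest execution time above, although the page itself does not say so.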