A Corpus of 21st Century Scots Texts

Levenshtein Distance

Enter a word to find nearest neighbouring words, for example ahint

Similar words to eejit in Corpus

Methods: Levenshtein, Double Levenshtein, SoundEx, MetaPhone, Manually curated

Levenshtein (distance in brackets)
eejit (0) - 71 freq
eeyjit (1) - 1 freq
eeyit (1) - 1 freq
eijit (1) - 1 freq
eegit (1) - 4 freq
ejit (1) - 1 freq
eekit (1) - 6 freq
eeejit (1) - 1 freq
eedjit (1) - 7 freq
egejit (1) - 1 freq
ejjit (1) - 1 freq
eejits (1) - 50 freq
hejit (1) - 1 freq
leemit (2) - 14 freq
eediot (2) - 5 freq
remit (2) - 11 freq
seedit (2) - 1 freq
ees't (2) - 8 freq
jit (2) - 2 freq
eatit (2) - 1 freq
weedit (2) - 1 freq
heezit (2) - 4 freq
kepit (2) - 3 freq
semit (2) - 1 freq
exit (2) - 28 freq
Double Levenshtein (distance in brackets)
eejit (0) - 71 freq
ejit (1) - 1 freq
eeyjit (1) - 1 freq
eijit (1) - 1 freq
eeejit (1) - 1 freq
eejits (2) - 50 freq
ejjit (2) - 1 freq
jit (2) - 2 freq
hejit (2) - 1 freq
jeit (2) - 2 freq
egejit (2) - 1 freq
eegit (2) - 4 freq
eeyit (2) - 1 freq
eedjit (2) - 7 freq
eekit (2) - 6 freq
eikit (3) - 65 freq
eijits (3) - 1 freq
eet (3) - 581 freq
feit (3) - 24 freq
jet (3) - 16 freq
deit (3) - 9 freq
deeit (3) - 1 freq
peyit (3) - 9 freq
seit (3) - 1 freq
leit (3) - 15 freq
SoundEx code - E230
eichty - 7 freq
eicht - 61 freq
eesed - 65 freq
eest - 24 freq
eight - 69 freq
est - 22 freq
eaught - 1 freq
eejit - 71 freq
eiked - 3 freq
eighth - 4 freq
east - 307 freq
eikit - 65 freq
echtie - 4 freq
eschewed - 2 freq
echoed - 15 freq
echt - 115 freq
exit - 28 freq
eeejit - 1 freq
eegit - 4 freq
ecuid - 3 freq
echaed - 2 freq
eked - 3 freq
echae'd - 1 freq
egged - 5 freq
eeight - 1 freq
eichtie - 2 freq
echty - 22 freq
eestae - 3 freq
eastae - 1 freq
ejit - 1 freq
eesta - 1 freq
exude - 4 freq
eighty - 13 freq
eggheid - 1 freq
excite - 2 freq
'eicht - 2 freq
ect - 25 freq
echth - 1 freq
eaucht - 1 freq
eeyjit - 1 freq
eyght - 4 freq
‚eggit - 1 freq
egt - 1 freq
equate - 5 freq
exceed - 1 freq
eekit - 6 freq
esto - 2 freq
equity - 7 freq
eskside - 1 freq
ees't - 8 freq
eeside - 1 freq
eastawa - 3 freq
ekit - 1 freq
eased - 3 freq
eigged - 1 freq
eiged - 1 freq
‘east - 1 freq
'exit' - 1 freq
“eight - 1 freq
’est - 4 freq
“exit - 1 freq
eused - 1 freq
‘eighty - 1 freq
ekd - 1 freq
echyty - 1 freq
eoggudh - 1 freq
exdt - 1 freq
ezd - 1 freq
eijit - 1 freq
eaxxd - 1 freq
ejjit - 1 freq
egid - 1 freq
esd - 1 freq
ezsgwt - 1 freq
eyesite - 1 freq
exsdu - 1 freq
MetaPhone code - EJT
edged - 9 freq
eejit - 71 freq
eeejit - 1 freq
eegit - 4 freq
edgit - 1 freq
ejit - 1 freq
eeyjit - 1 freq
aeged - 1 freq
eiged - 1 freq
eijit - 1 freq
ejjit - 1 freq
egid - 1 freq
Manually curated
EEJIT
Time to execute Levenshtein function - 0.297642 milliseconds
The Levenshtein distance is the number of single-character replacements, insertions or deletions needed to transform one word into another. It is useful for detecting typos and alternative spellings.
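As a rough illustration of the definition above, here is a minimal Python sketch of the classic dynamic-programming calculation. The function name levenshtein is chosen for this example only, and this is not necessarily how the site computes its figures.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the single-character edits needed to turn a into b."""
    # previous[j] holds the distance between the processed prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # turning i characters into "" takes i deletions
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # replace, or keep if equal
            ))
        previous = current
    return previous[-1]


print(levenshtein("eejit", "eejits"))  # 1
print(levenshtein("eejit", "exit"))    # 2
```

The two printed values agree with the distances shown for eejits and exit in the Levenshtein list.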
Time to execute Double Levenshtein function - 0.537060 milliseconds
In a stroke of genius, this runs the Levenshtein function twice, once on the full words and once with the vowels removed, and adds the two distances together, giving double weight to consonants.
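A hedged Python sketch of that idea follows. The names double_levenshtein and strip_vowels are invented for this example. The vowel set, including y, is an assumption: it reproduces the distances listed for eejit, but the page does not state which letters are stripped.

```python
VOWELS = set("aeiouy")  # assumption: the page does not list its vowel set


def levenshtein(a: str, b: str) -> int:
    """Plain Levenshtein distance (insert, delete, replace)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


def strip_vowels(word: str) -> str:
    """Return the word with every assumed vowel removed."""
    return "".join(ch for ch in word if ch.lower() not in VOWELS)


def double_levenshtein(a: str, b: str) -> int:
    """Distance on the full words plus distance on the consonant skeletons."""
    return levenshtein(a, b) + levenshtein(strip_vowels(a), strip_vowels(b))


print(double_levenshtein("eejit", "ejit"))    # 1: only a vowel differs
print(double_levenshtein("eejit", "eejits"))  # 2: the extra consonant counts twice
```

Because a consonant change shows up in both passes, eejits scores 2 here while ejit stays at 1, as in the Double Levenshtein list.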
Time to execute SoundEx function - 0.058039 milliseconds
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.
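To show how a code such as E230 comes about, here is a minimal Python sketch of the standard American Soundex rules: keep the first letter, map the remaining consonants to digits, collapse adjacent equal digits, and pad to four characters. The function name soundex is chosen for this example, and the site's own implementation may differ in edge cases.

```python
def soundex(word: str) -> str:
    """Return a four-character Soundex code, or "" if there are no letters."""
    groups = (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
              ("l", "4"), ("mn", "5"), ("r", "6"))
    codes = {ch: digit for letters, digit in groups for ch in letters}

    # Ignore apostrophes, quotes and anything else outside a-z.
    letters = [ch for ch in word.lower() if ch.isascii() and ch.isalpha()]
    if not letters:
        return ""

    result = letters[0].upper()
    previous = codes.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same digit
        digit = codes.get(ch, "")  # vowels and y map to ""
        if digit and digit != previous:
            result += digit
        previous = digit  # a vowel resets the run of equal digits
        if len(result) == 4:
            break
    return result.ljust(4, "0")


print(soundex("eejit"))    # E230
print(soundex("eskside"))  # E230
```

Both words reduce to E, then 2 (j, or the run s-k-s), then 3 (t or d), which is why they share a row in the SoundEx list.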
Time to execute MetaPhone function - 0.036840 milliseconds
Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation. It fundamentally improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names that sound similar.
Time to execute Manually curated function - 0.000883 milliseconds
Manual curation uses a hand-built lookup table (lexicon) that links words to their lemmas and includes obvious typos and spelling variations. Not all words are covered.
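A lookup of this kind can be sketched in Python as a plain dictionary. Only the eejit to EEJIT pairing comes from the results above. The other entries, the name LEXICON and the function lookup_lemma are hypothetical and stand in for the real table, whose contents are not shown.

```python
from typing import Optional

# Hand-maintained table: surface form -> lemma.
# Only "eejit" -> "EEJIT" is taken from the page; the rest are hypothetical.
LEXICON = {
    "eejit": "EEJIT",
    "eejits": "EEJIT",   # hypothetical plural entry
    "eedjit": "EEJIT",   # hypothetical spelling variant
}


def lookup_lemma(word: str) -> Optional[str]:
    """Return the curated lemma, or None when the word is not covered."""
    return LEXICON.get(word.lower())


print(lookup_lemma("eejit"))  # EEJIT
print(lookup_lemma("ahint"))  # None, since this sketch has no entry for it
```

A dictionary lookup does no string comparison across the corpus, which fits the very short execution time reported for this method.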