A Corpus of 21st Century Scots Texts


Levenshtein Distance

Enter a word to find nearest neighbouring words, for example ahint


Similar words to machar in Corpus

Methods, in order: Levenshtein, Double Levenshtein, SoundEx, MetaPhone, Manually curated
Levenshtein
machar (0) - 9 freq
machair (1) - 3 freq
machars (1) - 4 freq
tackar (2) - 1 freq
mcharg (2) - 1 freq
makkar (2) - 3 freq
macwhir (2) - 1 freq
char (2) - 2 freq
macca (2) - 12 freq
matha (2) - 2 freq
macao (2) - 1 freq
schar (2) - 1 freq
macaw (2) - 1 freq
machts (2) - 2 freq
maha (2) - 3 freq
marcha (2) - 1 freq
macho (2) - 9 freq
macari (2) - 1 freq
makar (2) - 98 freq
machin (2) - 1 freq
mach (2) - 2 freq
mackay (2) - 25 freq
achan (2) - 2 freq
macrae (2) - 5 freq
macht (2) - 3 freq
Double Levenshtein
machar (0) - 9 freq
machair (1) - 3 freq
machars (2) - 4 freq
macrae (3) - 5 freq
macht (3) - 3 freq
mach (3) - 2 freq
macari (3) - 1 freq
michal (3) - 1 freq
machin (3) - 1 freq
smacher (3) - 2 freq
teachar (3) - 1 freq
moocher (3) - 2 freq
muchas (3) - 1 freq
macho (3) - 9 freq
acher (3) - 1 freq
mocha (3) - 2 freq
mjchr (3) - 1 freq
schar (3) - 1 freq
mcharg (3) - 1 freq
macwhir (3) - 1 freq
char (3) - 2 freq
mcht (4) - 1 freq
hicher (4) - 1 freq
teichar (4) - 2 freq
michta (4) - 7 freq
SoundEx code - M260
measure - 27 freq
meesure - 2 freq
makar - 98 freq
major - 109 freq
misery - 31 freq
maker - 18 freq
mascara - 3 freq
meesery - 5 freq
mixer - 5 freq
maugre - 34 freq
mucker - 11 freq
meisure - 20 freq
moger - 1 freq
'major - 1 freq
mcguire - 4 freq
meisur - 18 freq
mockery - 9 freq
meagre - 9 freq
maigre - 1 freq
mauger - 8 freq
machair - 3 freq
mowser - 12 freq
micro - 3 freq
mckerrow - 2 freq
makker - 7 freq
makar' - 1 freq
maisser - 1 freq
majer - 1 freq
mayjer - 1 freq
misure - 8 freq
m'grew - 1 freq
makeower - 1 freq
meissure - 1 freq
micra - 1 freq
masseur - 2 freq
maguire - 26 freq
miser - 4 freq
mazr - 1 freq
mizzour - 3 freq
miesjir - 1 freq
mouser - 3 freq
mizzer - 4 freq
mcr - 1 freq
'micro' - 3 freq
makkar - 3 freq
maaker - 1 freq
maeshur - 1 freq
mcgraw - 1 freq
missure - 1 freq
émigré - 1 freq
maskara - 1 freq
maisure - 1 freq
miserie - 5 freq
megara - 3 freq
macro - 2 freq
maager - 1 freq
'makar - 1 freq
meiserie - 1 freq
macrae - 5 freq
machar - 9 freq
miscarry - 2 freq
misyur - 1 freq
mcrae - 4 freq
mcwhir - 1 freq
macwhir - 1 freq
mcquarry - 2 freq
maisrie - 2 freq
measuir - 1 freq
mizzure - 1 freq
mjr - 23 freq
mccr - 1 freq
mikeyr - 1 freq
moocher - 2 freq
mogre - 1 freq
mcgerry - 1 freq
meysr - 1 freq
mjchr - 1 freq
macari - 1 freq
MetaPhone code - MXR
machair - 3 freq
maeshur - 1 freq
machar - 9 freq
moocher - 2 freq
Manually curated
MACHAR
Time to execute Levenshtein function - 0.188648 milliseconds
The Levenshtein distance is the minimum number of single-character replacements, insertions or deletions needed to transform one word into another. It is useful for detecting typos and alternative spellings.
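As a rough illustration of the definition above, here is a minimal Python sketch of the distance using the usual row-by-row dynamic programme. The function name is an assumption for illustration; the code this site actually runs is not shown on the page.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit cost for replace, insert and delete."""
    # Keep the shorter word in the inner loop so the rows stay small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] = distance between the empty prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance between a[:i] and the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # replace (or keep) the character
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    for word in ("machar", "machair", "machars", "makar", "macho"):
        print(word, levenshtein("machar", word))
```

Run against the query word, this gives the same distances as the Levenshtein list above for machair (1), machars (1), makar (2) and macho (2).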
Time to execute Double Levenshtein function - 0.371284 milliseconds
In a stroke of genius, this runs the Levenshtein function twice, once on the full words and once with the vowels removed, and adds the two distances together, giving double weight to consonants.
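The idea can be sketched in Python as below. This is only a guess at the approach: the vowel set (a, e, i, o, u) and the function names are assumptions, and the site's own code may strip vowels differently.

```python
def levenshtein(a: str, b: str) -> int:
    """Plain edit distance (replace, insert, delete all cost 1)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


VOWELS = set("aeiou")  # assumption: 'y' is treated as a consonant


def strip_vowels(word: str) -> str:
    """Drop the (assumed) vowels, keeping consonants in order."""
    return "".join(ch for ch in word if ch.lower() not in VOWELS)


def double_levenshtein(a: str, b: str) -> int:
    """Distance on the full words plus distance on the consonant skeletons."""
    return levenshtein(a, b) + levenshtein(strip_vowels(a), strip_vowels(b))


if __name__ == "__main__":
    for word in ("machair", "machars", "macrae", "macho"):
        print(word, double_levenshtein("machar", word))
```

Under these assumptions the sketch reproduces the scores listed above: machair scores 1 (the extra vowel costs nothing in the second pass), machars scores 2 and macrae scores 3.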
Time to execute SoundEx function - 0.027916 milliseconds
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.
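To show how different spellings collapse to one code, here is a minimal Python sketch of standard American Soundex. It is an illustration rather than the site's implementation, and dropping non-ASCII letters and punctuation before encoding is an assumption made here for simplicity.

```python
# Letter groups and their Soundex digits; vowels, h, w and y have no digit.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(word: str) -> str:
    """Return the four-character Soundex code, or '' if there are no letters."""
    # Assumption: anything that is not an ASCII letter is ignored.
    letters = [c for c in word.lower() if c.isascii() and c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same digit
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
            if len(result) == 4:
                break
        prev = code  # a vowel resets prev, so a repeated digit can recur
    return result.ljust(4, "0")


if __name__ == "__main__":
    for word in ("machar", "measure", "makar", "mckerrow"):
        print(word, soundex(word))
```

All four example words encode to M260, which is why unrelated words such as measure and major appear alongside machar in the SoundEx list.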
Time to execute MetaPhone function - 0.037324 milliseconds
Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation. It fundamentally improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names that sound similar.
Time to execute Manually curated function - 0.000943 milliseconds
Manual curation uses a hand-built lookup table (lexicon) that links words to their lemmas and includes obvious typos and spelling variations. Not all words are covered.
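A lookup of this kind might be sketched in Python as follows. The table holds only the single machar to MACHAR mapping shown in the results above; the names LEXICON and lookup_lemma are hypothetical, and the real lexicon's storage format is not described on the page.

```python
from typing import Optional

# Hypothetical hand-built table: surface form -> lemma.
# Only the one mapping visible on this page is included.
LEXICON = {
    "machar": "MACHAR",
}


def lookup_lemma(word: str) -> Optional[str]:
    """Return the curated lemma, or None when the word is not covered."""
    return LEXICON.get(word.strip().lower())


if __name__ == "__main__":
    print(lookup_lemma("machar"))    # covered word
    print(lookup_lemma("unlisted"))  # not in the table, so None
```

A plain dictionary lookup needs no distance computation at all, which fits with this method having the shortest execution time reported above.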