A Corpus of 21st Century Scots Texts


Levenshtein Distance



Similar words to bathan-machines in Corpus

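Methods: Levenshtein, Double Levenshtein, SoundEx, MetaPhone, Manually curated. The results for each method follow in that order; bracketed numbers are distances from the search word.

Levenshtein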
bathan-machines (0) - 1 freq
bathin-machines (1) - 1 freq
washinmachines (4) - 1 freq
wishin-machines (4) - 1 freq
washinmachine (5) - 1 freq
dippin-machines (5) - 1 freq
time-machine (6) - 1 freq
machines (7) - 48 freq
lagamachie (7) - 1 freq
hannahhiles (7) - 1 freq
tax-mannies (7) - 3 freq
hanlawhiles (7) - 2 freq
lamgamachie (7) - 1 freq
bathin-huts (7) - 1 freq
mathematies (7) - 2 freq
langamachie (7) - 3 freq
bahoochies (7) - 1 freq
shenachies (7) - 1 freq
machine (8) - 163 freq
batman's (8) - 1 freq
tax-mannie (8) - 3 freq
brianjmckigen (8) - 1 freq
thatchin (8) - 2 freq
anomalies (8) - 2 freq
baa-faced (8) - 1 freq
Double Levenshtein
bathan-machines (0) - 1 freq
bathin-machines (1) - 1 freq
wishin-machines (6) - 1 freq
washinmachines (7) - 1 freq
dippin-machines (8) - 1 freq
washinmachine (9) - 1 freq
bathin-huts (10) - 1 freq
time-machine (10) - 1 freq
shenachies (12) - 1 freq
hause-chynes (12) - 1 freq
gethincjones (12) - 10 freq
bahoochies (12) - 1 freq
machines (12) - 48 freq
tax-mannies (12) - 3 freq
stamachs (13) - 2 freq
athenians (13) - 1 freq
taichins (13) - 1 freq
bluid-matchin (13) - 1 freq
tranches (13) - 1 freq
bairn-rhymes (13) - 4 freq
transactions (13) - 5 freq
fush-n-chips (13) - 1 freq
teachins (13) - 4 freq
branchin (13) - 3 freq
brain-washin (13) - 1 freq
SoundEx code - B352
bathin-machines - 1 freq
buttons - 55 freq
badness - 12 freq
biddins - 12 freq
betimes - 16 freq
'biddins - 1 freq
biddens - 1 freq
bottoms - 6 freq
bidding - 4 freq
bethank - 1 freq
bytimes - 5 freq
bidie-in's - 1 freq
beatins - 3 freq
bathing - 1 freq
bedding - 1 freq
batons - 1 freq
bidie-ins - 1 freq
boddom's - 1 freq
butaince - 1 freq
'botanical - 1 freq
botanical - 4 freq
buddoms - 1 freq
betymes - 6 freq
beeteen's - 1 freq
baetims - 2 freq
battens - 2 freq
bathan-machines - 1 freq
bedtime's - 1 freq
bathans - 1 freq
batwing - 1 freq
badinage - 2 freq
biting - 1 freq
biding - 9 freq
beating - 7 freq
byde-ins--- - 1 freq
bhatnagar - 1 freq
boattoms - 1 freq
btmknxu - 1 freq
betamax’s - 1 freq
bettinasross - 1 freq
biden's - 2 freq
booting - 2 freq
betting - 2 freq
botanist - 1 freq
boating - 1 freq
bethenec - 1 freq
MetaPhone code - B0NMXNS
bathin-machines - 1 freq
bathan-machines - 1 freq
Manually curated
BATHAN-MACHINES
Time to execute Levenshtein function - 0.216219 milliseconds
The Levenshtein distance is the minimum number of single-character replacements, insertions or deletions needed to transform one word into another. It's useful for detecting typos and alternative spellings.
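As an illustration of the definition above, here is a minimal Python sketch of the textbook dynamic-programming approach. It assumes each replacement, insertion and deletion costs 1; the function name `levenshtein` is chosen for this example and is not necessarily how the site computes its figures.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits turning a into b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # replace ca with cb (or keep it)
            ))
        prev = curr
    return prev[-1]


print(levenshtein("bathan-machines", "bathin-machines"))
print(levenshtein("bathan-machines", "washinmachines"))
```

With unit costs, the two calls should reproduce the distances listed for those words in the Levenshtein results.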
Time to execute Double Levenshtein function - 0.490685 milliseconds
In a stroke of genius, this runs the Levenshtein function twice, once on the full words and once with the vowels removed, and adds the two distances together, giving double weight to consonants.
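The double pass can be sketched in Python as follows. This is a reading of the description rather than the site's own code: it assumes the vowels removed are a, e, i, o and u (y is kept), and the names `strip_vowels` and `double_levenshtein` are made up for the example.

```python
def levenshtein(a: str, b: str) -> int:
    """Plain Levenshtein distance with unit edit costs."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def strip_vowels(word: str) -> str:
    # Assumption: only a, e, i, o, u count as vowels.
    return "".join(c for c in word if c.lower() not in "aeiou")


def double_levenshtein(a: str, b: str) -> int:
    """Distance on the full words plus distance on the vowel-less words."""
    return levenshtein(a, b) + levenshtein(strip_vowels(a), strip_vowels(b))


print(double_levenshtein("bathan-machines", "wishin-machines"))
```

Because consonant differences are counted in both passes and vowel differences only in the first, a pair such as bathan-machines and bathin-machines scores the same as with plain Levenshtein, while words with different consonants move further away.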
Time to execute SoundEx function - 0.031342 milliseconds
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.
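To show how differently spelled words can land on one code, here is a short Python sketch of American Soundex. It is an assumption that the site uses this variant: the sketch skips non-letters such as hyphens, treats H and W as transparent, and pads short codes with zeros. The names `CODES` and `soundex` are illustrative.

```python
# Digit classes for American Soundex. Vowels and Y map to "0", which is
# never written out but separates repeated consonant codes.
CODES = {}
for digit, letters in {
    "1": "BFPV", "2": "CGJKQSXZ", "3": "DT",
    "4": "L", "5": "MN", "6": "R", "0": "AEIOUY",
}.items():
    for letter in letters:
        CODES[letter] = digit


def soundex(word: str) -> str:
    # Keep only A-Z; hyphens, apostrophes and digits are ignored.
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = letters[0]
    last = CODES.get(letters[0])  # None when the first letter is H or W
    for c in letters[1:]:
        code = CODES.get(c)
        if code is None:  # H or W: skipped, does not separate duplicates
            continue
        if code != last and code != "0":
            result += code
            if len(result) == 4:
                break
        last = code
    return result.ljust(4, "0")


print(soundex("bathan-machines"), soundex("buttons"), soundex("bhatnagar"))
```

Only the first letter and the next three consonant classes survive, which is why a long compound such as bathan-machines shares a code with short words like buttons.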
Time to execute MetaPhone function - 0.045624 milliseconds
Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation. It fundamentally improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names that sound similar.
Time to execute Manually curated function - 0.000953 milliseconds
Manual curation uses a hand-built lookup table (lexicon) that links words to their lemmas and includes obvious typos and spelling variations. Not all words are covered.
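A lookup of this kind might be sketched in Python as a plain dictionary. Everything in the example is hypothetical: the `LEXICON` entries, the lemma `bathing-machine` and the function name `lookup_lemma` are invented for illustration and are not taken from the site's actual lexicon.

```python
from typing import Optional

# Hypothetical hand-made lexicon: spelling variant -> lemma.
LEXICON = {
    "bathan-machines": "bathing-machine",
    "bathin-machines": "bathing-machine",
}


def lookup_lemma(word: str) -> Optional[str]:
    """Return the curated lemma, or None when the word is not covered."""
    return LEXICON.get(word.lower())


print(lookup_lemma("bathan-machines"))
print(lookup_lemma("washinmachines"))
```

A dictionary lookup involves no string comparison against the rest of the vocabulary, which fits with this method having the shortest execution time in the list.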