A Corpus of 21st Century Scots Texts

Levenshtein Distance

Enter a word to find nearest neighbouring words, for example sonsie

Similar words to smithsonian in Corpus

Levenshtein Double Levenshtein SoundEx MetaPhone Manually curated
smithsonian (0) - 3 freq
smitherin (4) - 1 freq
lithuanian (4) - 1 freq
smithin (4) - 1 freq
sinfonia (5) - 1 freq
motionin (5) - 1 freq
swithin (5) - 1 freq
sizzonin (5) - 1 freq
etonian (5) - 1 freq
smiths (5) - 3 freq
amazonian (5) - 1 freq
sair-sounin (5) - 1 freq
smotherin (5) - 2 freq
smithermann (5) - 1 freq
shonin (5) - 1 freq
slitherin (5) - 2 freq
smutherin (5) - 1 freq
poisonin (5) - 3 freq
smithhill (5) - 1 freq
smitteneen (5) - 1 freq
athenian (5) - 14 freq
smuithin (5) - 1 freq
smitsome (5) - 1 freq
saisoneen (5) - 1 freq
mithna (5) - 2 freq
Double Levenshtein
smithsonian (0) - 3 freq
smithin (6) - 1 freq
smitherin (6) - 1 freq
smotherin (7) - 2 freq
smutherin (7) - 1 freq
smitteneen (7) - 1 freq
smiths (7) - 3 freq
smuithin (7) - 1 freq
smithermann (7) - 1 freq
lithuanian (7) - 1 freq
switherin (8) - 27 freq
seithin (8) - 1 freq
stoonin (8) - 7 freq
sumthein (8) - 4 freq
sithrin (8) - 1 freq
mitherin (8) - 7 freq
smitin (8) - 1 freq
simethin (8) - 1 freq
smithereens (8) - 7 freq
synthesisin (8) - 1 freq
simthin (8) - 11 freq
smeethin (8) - 2 freq
sithean (8) - 1 freq
mooths-an (8) - 3 freq
somethin (8) - 1119 freq
SoundEx code - S532
sounds - 115 freq
snetchit - 1 freq
smudges - 2 freq
saunds - 5 freq
soonds - 248 freq
sundays - 15 freq
sends - 44 freq
snatch - 14 freq
soonds'll - 2 freq
sands - 24 freq
sandwiches - 35 freq
snatches - 7 freq
smitch - 8 freq
sandwich - 16 freq
saundstane - 5 freq
senatus - 2 freq
sandy's - 15 freq
saund-kelpie - 1 freq
saund-kelpies - 1 freq
'saund-kelpie' - 1 freq
'scents - 1 freq
scientists - 15 freq
syntax - 20 freq
synthesis - 3 freq
synthesisin - 1 freq
synthesin - 1 freq
saunts - 5 freq
somedy's - 2 freq
smudged - 4 freq
scientist - 11 freq
semmits - 9 freq
smiths - 3 freq
saints - 18 freq
smiddy's - 2 freq
snouts - 2 freq
smidgin - 2 freq
sonnets - 8 freq
semitic - 1 freq
santiago - 2 freq
smudge - 11 freq
sandsteen - 2 freq
sandstane - 7 freq
sandwichees - 1 freq
soundcheck - 4 freq
soundies - 2 freq
soundcheck's - 1 freq
soundchecks - 1 freq
sandwhich - 1 freq
squints - 2 freq
shawiands - 1 freq
sandcastle - 1 freq
'sounds - 4 freq
snitchers - 1 freq
scents - 5 freq
snaw-white's - 1 freq
shanties - 1 freq
saundcastles - 1 freq
santa's - 9 freq
sanitiser - 3 freq
sanitizer - 3 freq
snitches - 2 freq
smowts - 1 freq
saunt's - 1 freq
snatched - 9 freq
sandwiched - 2 freq
some'dy's - 2 freq
scandic - 3 freq
skin-ticht - 1 freq
sunday's - 3 freq
sumdy's - 3 freq
scants - 1 freq
snootcloot - 2 freq
sneds - 3 freq
skinticht - 1 freq
suntie's - 2 freq
summits - 5 freq
saundshoe - 1 freq
sonnets' - 1 freq
snodcakes - 1 freq
snodcake - 1 freq
syndes - 1 freq
smoots - 2 freq
smith's - 6 freq
sandwicht - 2 freq
smooths - 1 freq
simmets - 2 freq
snatchets - 1 freq
smoothy's - 1 freq
smuthick - 1 freq
sinths - 1 freq
sinthesised - 1 freq
sommat's - 2 freq
sandstone - 6 freq
smitts - 2 freq
sandisans - 1 freq
sandside - 1 freq
sandsend - 1 freq
sandsgarth - 1 freq
sanitised - 2 freq
snitch - 8 freq
smitsome - 1 freq
sun-dicht - 1 freq
simmet's - 1 freq
somedie's - 1 freq
sanitise - 1 freq
synds - 2 freq
snaitched - 1 freq
snaitchin - 1 freq
sand-stane - 1 freq
skinheids - 1 freq
snoots - 4 freq
smaads - 1 freq
seimits - 1 freq
saands - 1 freq
smutchack - 1 freq
schnitzel - 1 freq
sand-clogged - 1 freq
sandshun - 1 freq
soondscapes - 1 freq
sceintic - 1 freq
soond's - 1 freq
sunties - 2 freq
smeeth-caimbed - 1 freq
smits - 1 freq
smithsonian - 3 freq
smithsonianfolklifefestival - 1 freq
snatchan - 1 freq
sandsound - 3 freq
shindig - 3 freq
semiotics - 1 freq
sandstrøm - 1 freq
sonatas - 2 freq
snatchers - 1 freq
sanitisin - 1 freq
snatchin - 1 freq
snatcher - 1 freq
syntactic - 1 freq
sandy-coloured - 1 freq
sandstorm - 1 freq
smatchet - 1 freq
smiddies - 1 freq
sanitising - 1 freq
“soonds - 1 freq
sumdys - 1 freq
sants - 1 freq
syndicalists - 1 freq
soand-so - 1 freq
somatic - 1 freq
snaw-dusted - 1 freq
shandwick - 1 freq
syntactical - 2 freq
somedae's - 1 freq
smootie's - 6 freq
'smootie's - 1 freq
smidgen - 1 freq
smtxabi - 1 freq
smtxjbdlt - 1 freq
sandiescot - 2 freq
syndicate - 1 freq
smith’s - 1 freq
snoot-cloot - 1 freq
sunday’s - 1 freq
snitching - 1 freq
shindog - 1 freq
shandys - 1 freq
shandies - 1 freq
sundaycreaking - 1 freq
semtex - 1 freq
soundcloud - 3 freq
snowthistle - 1 freq
sinethugcat - 2 freq
smithycroftlt - 2 freq
smithycroft - 4 freq
sannytizer - 1 freq
sandwick - 3 freq
santy's - 1 freq
smithycrofteng - 5 freq
'sandwich' - 1 freq
snitch' - 1 freq
sandstonepress - 2 freq
sandys - 8 freq
santas - 2 freq
sundayshoutsfc - 3 freq
's math sin - 1 freq
snettsbirder - 5 freq
sendsnowdayhelp - 1 freq
saintso - 1 freq
MetaPhone code - SM0SNN
smithsonian - 3 freq
Manually curated lemma - SMITHSONIAN
Time to execute Levenshtein function - 0.315884 milliseconds
The Levenshtein distance is the minimum number of single-character replacements, insertions or deletions needed to transform one word into another. It is useful for detecting typos and alternative spellings.
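As a rough illustration of the definition above, here is a minimal Python sketch of the standard dynamic-programming calculation. The function name `levenshtein` is chosen for this example and is not taken from the site's own code, which may be implemented differently.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions or substitutions."""
    # Keep the shorter word as the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


print(levenshtein("smithsonian", "smithin"))
```

For the search term on this page, the sketch gives 4 for smithin, which agrees with the figure shown in the Levenshtein list.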
Time to execute Double Levenshtein function - 0.768658 milliseconds
In a stroke of genius, this runs the Levenshtein function twice, once on the full words and once with the vowels removed, and adds the two distances together, giving double weight to consonants.
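The double calculation can be sketched in Python as follows. This is an illustration under assumptions: the vowel set is taken to be a, e, i, o, u, and the names `double_levenshtein` and `strip_vowels` are invented for this example. The site's actual rules (for instance, how y or accented letters are treated) are not stated on this page.

```python
VOWELS = set("aeiou")  # assumption: only these five letters count as vowels


def levenshtein(a: str, b: str) -> int:
    """Plain edit distance: insertions, deletions and substitutions cost 1 each."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


def strip_vowels(word: str) -> str:
    """Drop vowels so that only the consonant skeleton remains."""
    return "".join(ch for ch in word if ch not in VOWELS)


def double_levenshtein(a: str, b: str) -> int:
    """Distance on the full words plus distance on the vowel-free words."""
    a, b = a.lower(), b.lower()
    return levenshtein(a, b) + levenshtein(strip_vowels(a), strip_vowels(b))


print(double_levenshtein("smithsonian", "smithin"))
```

Under these assumptions smithin scores 4 on the full words plus 2 on the consonant skeletons (smthsnn against smthn), giving 6, which matches the second list on this page.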
Time to execute SoundEx function - 0.030803 milliseconds
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.
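To show how words end up sharing a code such as S532, here is a minimal Python sketch of the common American Soundex rules: keep the first letter, map the remaining consonants to digits, skip vowels, collapse adjacent repeats, and pad to four characters. The function name `soundex` and the decision to ignore non a-z characters are assumptions made for this example. The site's own implementation may differ in edge cases.

```python
# Consonant groups and their Soundex digits.
CODES = {}
for digit, letters in enumerate(["bfpv", "cgjkqsxz", "dt", "l", "mn", "r"], start=1):
    for letter in letters:
        CODES[letter] = str(digit)


def soundex(word: str) -> str:
    """Return a four-character Soundex code, or '' if the word has no a-z letters."""
    letters = [ch for ch in word.lower() if "a" <= ch <= "z"]
    if not letters:
        return ""
    result = letters[0].upper()
    previous = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != previous:
            result += code
        previous = code  # a vowel resets this, so a repeated code can appear again
        if len(result) == 4:
            break
    return result.ljust(4, "0")


for w in ["smithsonian", "sounds", "sandwich"]:
    print(w, soundex(w))
```

All three words in the loop come out as S532, which is why such different-looking words appear together in the SoundEx list.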
Time to execute MetaPhone function - 0.166319 milliseconds
Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation.[1] It fundamentally improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names which sound similar.
Time to execute Manually curated function - 0.001241 milliseconds
Manual curation uses a hand-built lookup table (lexicon) that links words to their lemmas and includes obvious typos and spelling variations. Not all words are covered.
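A lookup of this kind can be pictured as a plain dictionary. The Python sketch below is hypothetical: the names `LEXICON` and `lemma_for` are invented, and apart from the smithsonian entry, which mirrors the result shown on this page, the entries are illustrative rather than taken from the real lexicon.

```python
from typing import Optional

# Hypothetical hand-built lexicon: spelling variant -> lemma.
# Only the "smithsonian" entry reflects output on this page; the others are made up.
LEXICON = {
    "smithsonian": "SMITHSONIAN",
    "somethin": "SOMETHING",
    "simthin": "SOMETHING",
}


def lemma_for(word: str) -> Optional[str]:
    """Return the curated lemma, or None when the word is not covered."""
    return LEXICON.get(word.lower())


print(lemma_for("smithsonian"))
print(lemma_for("sonsie"))
```

A dictionary lookup does no computation on the spelling itself, which fits with this being by far the fastest of the five methods timed above.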