A Corpus of 21st Century Scots Texts

Levenshtein Distance

Enter a word to find nearest neighbouring words, for example ahint

Similar words to somehing in Corpus

Methods: Levenshtein, Double Levenshtein, SoundEx, MetaPhone, Manually curated

Levenshtein (distance in brackets)
somehing (0) - 6 freq
somehin' (1) - 1 freq
something (1) - 441 freq
somehin (1) - 30 freq
soughing (2) - 1 freq
somethings (2) - 1 freq
somehin's (2) - 1 freq
soothing (2) - 3 freq
someyin (2) - 2 freq
somthin (2) - 6 freq
somchin (2) - 1 freq
sumhing (2) - 4 freq
sumthing (2) - 24 freq
somethins (2) - 2 freq
somethin (2) - 1079 freq
sometin (2) - 7 freq
somethin' (2) - 10 freq
somehou (3) - 20 freq
soothin' (3) - 1 freq
coming (3) - 116 freq
smoking (3) - 13 freq
omethin (3) - 1 freq
shewing (3) - 1 freq
soberin (3) - 2 freq
bombing (3) - 2 freq
Double Levenshtein (distance in brackets)
somehing (0) - 6 freq
sumhing (2) - 4 freq
somehin (2) - 30 freq
something (2) - 441 freq
somehin' (2) - 1 freq
sumthing (3) - 24 freq
soothing (3) - 3 freq
soughing (3) - 1 freq
smiling (4) - 16 freq
smoking (4) - 13 freq
sooming (4) - 2 freq
sumhins (4) - 1 freq
sumhin (4) - 37 freq
souchong (4) - 1 freq
smashing (4) - 6 freq
somethin' (4) - 10 freq
seeming (4) - 1 freq
somthin (4) - 6 freq
somchin (4) - 1 freq
someyin (4) - 2 freq
somehin's (4) - 1 freq
somethings (4) - 1 freq
somethin (4) - 1079 freq
somethins (4) - 2 freq
sometin (4) - 7 freq
SoundEx code - S552
someone's - 4 freq
shenanigans - 8 freq
scanning - 1 freq
seaman's - 1 freq
sameness - 1 freq
summin's - 1 freq
sea-monsters - 1 freq
shining - 8 freq
'someones - 1 freq
summons - 3 freq
showmanship - 2 freq
seemingly - 14 freq
'scummins' - 1 freq
simon's - 2 freq
seemon's - 4 freq
smawness - 1 freq
someens - 2 freq
some-eans - 1 freq
swimming - 7 freq
sooming - 2 freq
simmins - 1 freq
sinians - 1 freq
simmans - 1 freq
someeen's - 1 freq
saimeness - 1 freq
some'hing's - 2 freq
shamanistic - 1 freq
sinnons - 1 freq
sumeen's - 1 freq
shamanic - 2 freq
showman-cum-grocer - 2 freq
someanes - 1 freq
seeming - 1 freq
sumhin's - 1 freq
sumhins - 1 freq
sweeming - 1 freq
somehing - 6 freq
sumhing - 4 freq
somehin's - 1 freq
smaaness - 1 freq
sooning - 1 freq
snowing - 4 freq
sumink - 1 freq
seamanship - 1 freq
symington - 1 freq
smaoineachadh - 1 freq
simmons - 1 freq
skinning - 1 freq
summming - 1 freq
sunning - 1 freq
shannons - 1 freq
shaunamacd - 1 freq
somewan's - 3 freq
someen's - 1 freq
shaming - 1 freq
sweenyness - 1 freq
someones - 1 freq
MetaPhone code - SMHNK
somehing - 6 freq
sumhing - 4 freq
Manually curated
SOMEHING
Time to execute Levenshtein function - 0.213415 milliseconds
The Levenshtein distance is the minimum number of single-character replacements, insertions or deletions needed to transform one word into another. It is useful for detecting typos and alternative spellings.
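As a rough illustration of the definition above, here is a minimal Python sketch of the standard dynamic-programming approach. It is not the code behind this page; the function name `levenshtein` is an assumption chosen for the example.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits turning a into b."""
    # Keep the shorter word in the inner loop so the rows stay small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] holds the distance between the processed prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # replace ca with cb (or keep it)
            ))
        previous = current
    return previous[-1]


print(levenshtein("somehing", "something"))  # one inserted "t"
print(levenshtein("somehing", "coming"))
```

With this sketch, "something" comes out at distance 1 and "coming" at distance 3 from "somehing", in line with the Levenshtein list above.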
Time to execute Double Levenshtein function - 0.349117 milliseconds
In a stroke of genius, this runs the Levenshtein function twice, once on the full words and once with the vowels removed, and adds the two distances together, giving double weight to consonants.
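The idea can be sketched in a few lines of Python. This is an illustration only, under the assumption that "vowels" means the letters a, e, i, o, u (so y counts as a consonant); the names `strip_vowels` and `double_levenshtein` are made up for the example and are not taken from the site.

```python
def levenshtein(a: str, b: str) -> int:
    """Plain Levenshtein distance (row-by-row dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


VOWELS = set("aeiou")  # assumption: y is treated as a consonant


def strip_vowels(word: str) -> str:
    """Drop the vowels, keeping consonants and any other characters."""
    return "".join(ch for ch in word if ch.lower() not in VOWELS)


def double_levenshtein(a: str, b: str) -> int:
    """Distance on the full words plus distance on the consonant skeletons."""
    return levenshtein(a, b) + levenshtein(strip_vowels(a), strip_vowels(b))


print(double_levenshtein("somehing", "sumhing"))   # vowels differ, consonants match
print(double_levenshtein("somehing", "somethin"))  # consonants differ too
```

Under that assumption "sumhing" scores 2 + 0 = 2 and "somethin" scores 2 + 2 = 4, which agrees with the Double Levenshtein list above: a word that differs only in its vowels is pushed up the ranking.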
Time to execute SoundEx function - 0.027904 milliseconds
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.
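For readers who want to see how a code such as S552 comes about, the following Python sketch implements the common American Soundex rules. It is an assumption that the site uses this variant; the function name `soundex` and the decision to ignore non-letters such as apostrophes and hyphens are choices made for this example.

```python
# Consonant groups that share a Soundex digit.
_GROUPS = (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
           ("l", "4"), ("mn", "5"), ("r", "6"))
_CODES = {ch: digit for letters, digit in _GROUPS for ch in letters}


def soundex(word: str) -> str:
    """Return the four-character Soundex code, or "" if there are no letters."""
    letters = [ch for ch in word.lower() if ch.isascii() and ch.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    previous = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w are skipped and do not separate equal codes
        code = _CODES.get(ch, "")  # vowels and y have no code
        if code and code != previous:
            result += code
            if len(result) == 4:
                break
        previous = code  # a vowel resets this, so a repeated code counts again
    return result.ljust(4, "0")  # pad short codes with zeros


print(soundex("somehing"))  # S552
print(soundex("swimming"))  # S552
```

The m and the n both map to 5 and the g maps to 2, which is why "somehing", "shining" and "swimming" all land in the same S552 bucket despite looking quite different.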
Time to execute MetaPhone function - 0.036668 milliseconds
Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation. It fundamentally improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names that sound similar.
Time to execute Manually curated function - 0.000810 milliseconds
Manual curation uses a hand-built lookup table (lexicon) that links words to their lemmas and includes obvious typos and spelling variations. Not all words are covered.
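A lookup of this kind can be pictured as a plain dictionary, as in the Python sketch below. The entries shown are hypothetical placeholders invented for illustration; they are not taken from the site's actual lexicon, and the names `LEXICON` and `lookup_lemma` are likewise assumptions.

```python
from typing import Optional

# Hypothetical hand-made entries: spelling variant -> lemma.
# The real lexicon's contents are not shown on this page.
LEXICON = {
    "variant-a": "LEMMA-ONE",
    "variant-b": "LEMMA-ONE",
    "variant-c": "LEMMA-TWO",
}


def lookup_lemma(word: str) -> Optional[str]:
    """Return the curated lemma, or None when the word is not covered."""
    return LEXICON.get(word.lower())


print(lookup_lemma("Variant-A"))  # LEMMA-ONE
print(lookup_lemma("unlisted"))   # None
```

A single dictionary lookup involves no string comparison against the rest of the corpus, which fits with this method having by far the shortest execution time reported above.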