A Corpus of 21st Century Scots Texts


Levenshtein Distance

Enter a word to find nearest neighbouring words, for example sonsie


Similar words to surmise in Corpus

Methods: Levenshtein, Double Levenshtein, SoundEx, MetaPhone, Manually curated

Levenshtein (distance in brackets)
surmise (0) - 3 freq
surmised (1) - 3 freq
surprise (2) - 210 freq
survive (2) - 69 freq
sunrise (2) - 14 freq
urwise (2) - 1 freq
hurrie (3) - 1 freq
purse (3) - 38 freq
strive (3) - 8 freq
urie (3) - 1 freq
strite (3) - 1 freq
summits (3) - 5 freq
curnie (3) - 6 freq
sumink (3) - 1 freq
barmie (3) - 1 freq
sumtime (3) - 3 freq
burnist (3) - 1 freq
wurse (3) - 16 freq
supervise (3) - 3 freq
cursive (3) - 2 freq
muise (3) - 1 freq
bruise (3) - 15 freq
sprose (3) - 2 freq
sortie (3) - 2 freq
atomise (3) - 1 freq
Double Levenshtein (distance in brackets)
surmise (0) - 3 freq
surmised (2) - 3 freq
permis (4) - 1 freq
promise (4) - 119 freq
aurms (4) - 1 freq
rumse (4) - 2 freq
surks (4) - 1 freq
surhoose (4) - 1 freq
sure's (4) - 3 freq
wurms (4) - 3 freq
sums (4) - 43 freq
siamese (4) - 1 freq
urwise (4) - 1 freq
furms (4) - 9 freq
survive (4) - 69 freq
surprise (4) - 210 freq
surfies (4) - 2 freq
sunrise (4) - 14 freq
surmatyse (4) - 1 freq
semis (4) - 4 freq
armies (5) - 4 freq
serrs (5) - 5 freq
worms (5) - 22 freq
skims (5) - 2 freq
sermons (5) - 9 freq
SoundEx code - S652
scrunched - 6 freq
shrink - 14 freq
shrinkin - 17 freq
sirens - 11 freq
shrunk - 9 freq
screams - 33 freq
scroongin - 2 freq
scarns - 1 freq
skirmish - 1 freq
screens - 22 freq
shrank - 3 freq
soorness - 3 freq
scrans - 3 freq
scrums - 1 freq
shairnesses - 1 freq
sourness - 1 freq
scronach - 2 freq
scaring - 4 freq
swearing - 2 freq
scrunches - 4 freq
screen's - 1 freq
siurring - 1 freq
scrunchin - 2 freq
scrounger - 1 freq
skrinkit - 1 freq
sairness - 8 freq
sharing - 26 freq
'sharing - 1 freq
scrounge - 3 freq
'scroungers' - 1 freq
scroungers - 1 freq
sharon's - 6 freq
syringe - 5 freq
scrymsour - 1 freq
skirmisher - 1 freq
scrymgeour - 2 freq
scrunchit-up - 1 freq
scruncht - 2 freq
scrunchit - 1 freq
scringein - 3 freq
scringin - 1 freq
skrunkled - 3 freq
screengin - 24 freq
shearing - 1 freq
syringes - 1 freq
skrunklin - 3 freq
skrankie - 4 freq
surmise - 3 freq
screenge - 3 freq
screengers - 1 freq
skrinklan - 1 freq
screensaver's - 1 freq
screenshot - 2 freq
serengeti - 3 freq
surmised - 3 freq
sweirness - 5 freq
skrinks - 1 freq
skrink - 1 freq
skrinkie - 1 freq
skrank - 2 freq
schramsberg - 1 freq
scrunkelt - 1 freq
shrinkan - 4 freq
scronacht - 2 freq
surrouns - 3 freq
swarms - 2 freq
screensaver - 3 freq
sureness - 1 freq
skrinkin - 1 freq
scrunchie - 1 freq
surmeesin - 1 freq
scorns - 1 freq
soaring - 1 freq
scurrying - 1 freq
shrinks - 1 freq
seering - 1 freq
squaring - 1 freq
shrunken - 1 freq
shairnscleuch - 1 freq
'shrink - 1 freq
scoring - 7 freq
sjrnxdam - 1 freq
screenwash - 1 freq
sharonshannon - 1 freq
skewering - 1 freq
scarnach - 1 freq
suerankinsays - 1 freq
sarnies - 1 freq
screenshots - 1 freq
shrooms - 1 freq
serenesquirrel - 2 freq
MetaPhone code - SRMS
surmise - 3 freq
Manually curated - SURMISE
Time to execute Levenshtein function - 0.353272 milliseconds
The Levenshtein distance is the minimum number of single-character replacements, insertions or deletions needed to transform one word into another. It is useful for detecting typos and alternative spellings.
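As a rough illustration of the definition above, here is a minimal Python sketch of the standard dynamic-programming edit distance with unit costs. The function name `levenshtein` is chosen for this example; it is not taken from the site's own code, which may differ.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit cost for each replace, insert or delete."""
    # previous[j] = distance between the prefix of `a` handled so far
    # and the first j characters of `b`.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # turning i characters of `a` into "" takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # replace ca with cb, or keep it
            ))
        previous = current
    return previous[-1]


print(levenshtein("surmise", "surmised"))  # 1
print(levenshtein("surmise", "surprise"))  # 2
```

Both values agree with the distances shown in the Levenshtein listing for surmise.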
Time to execute Double Levenshtein function - 0.619986 milliseconds
In a stroke of genius, this runs the Levenshtein function twice, once on the full words and once with the vowels removed, and adds the two distances together, giving double weight to consonants.
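A minimal Python sketch of that idea follows. It assumes that "vowels" means the letters a, e, i, o and u, and that comparison is case-insensitive; the names `levenshtein`, `strip_vowels` and `double_levenshtein` are illustrative and not taken from the site.

```python
def levenshtein(a: str, b: str) -> int:
    """Plain edit distance with unit costs (row-by-row dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete
                current[j - 1] + 1,            # insert
                previous[j - 1] + (ca != cb),  # replace or keep
            ))
        previous = current
    return previous[-1]


VOWELS = set("aeiou")  # assumption: y is treated as a consonant


def strip_vowels(word: str) -> str:
    """Return the word with a, e, i, o, u removed."""
    return "".join(ch for ch in word if ch not in VOWELS)


def double_levenshtein(a: str, b: str) -> int:
    """Distance on the full words plus distance on their consonant skeletons."""
    a, b = a.lower(), b.lower()
    return levenshtein(a, b) + levenshtein(strip_vowels(a), strip_vowels(b))


print(double_levenshtein("surmise", "surmised"))  # 1 + 1 = 2
print(double_levenshtein("surmise", "promise"))   # 3 + 1 = 4
```

Under these assumptions the sketch gives the same scores as the Double Levenshtein listing for surmised (2), promise (4) and surprise (4).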
Time to execute SoundEx function - 0.027639 milliseconds
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.
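To show how different spellings end up with the same code, here is a small Python sketch of the common American Soundex rules: keep the first letter, map the remaining consonants to digits, collapse adjacent repeats, and pad to four characters. The site's own function may handle edge cases differently; `soundex` and `SOUNDEX_CODES` are names invented for this example.

```python
SOUNDEX_CODES = {
    **dict.fromkeys("bfpv", "1"),
    **dict.fromkeys("cgjkqsxz", "2"),
    **dict.fromkeys("dt", "3"),
    "l": "4",
    **dict.fromkeys("mn", "5"),
    "r": "6",
}


def soundex(word: str) -> str:
    """Return a four-character Soundex code, or "" if the word has no letters."""
    letters = [ch for ch in word.lower() if ch.isalpha()]
    if not letters:
        return ""
    code = letters[0].upper()
    last = SOUNDEX_CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = SOUNDEX_CODES.get(ch, "")
        if digit:
            if digit != last:  # adjacent letters with the same code count once
                code += digit
            last = digit
        elif ch not in "hw":
            last = ""          # a vowel (or y) lets the same code repeat
        if len(code) == 4:
            break
    return code.ljust(4, "0")  # pad short codes with zeros


print(soundex("surmise"))  # S652
print(soundex("shrink"))   # S652
```

Both words from the listing map to S652, which is why shrink appears among the SoundEx matches for surmise even though the spellings have little in common.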
Time to execute MetaPhone function - 0.074215 milliseconds
Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation. It improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names that sound similar.
Time to execute Manually curated function - 0.004237 milliseconds
Manual curation uses a hand-built lookup table (lexicon) that links words to their lemmas and includes obvious typos and spelling variations. Not all words are covered.
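A lookup of this kind can be sketched in Python as a plain dictionary from word form to lemma. The table below is hypothetical: only the link from surmise to SURMISE appears in the output above, the other two entries are assumed for illustration, and `LEXICON`, `lemma_of` and `variants_of` are invented names.

```python
from typing import List, Optional

# Hypothetical hand-built lexicon: word form -> lemma.
LEXICON = {
    "surmise": "SURMISE",    # shown in the output above
    "surmised": "SURMISE",   # assumed entry, for illustration only
    "surmeesin": "SURMISE",  # assumed entry, for illustration only
}


def lemma_of(word: str) -> Optional[str]:
    """Return the curated lemma, or None when the word is not covered."""
    return LEXICON.get(word.lower())


def variants_of(lemma: str) -> List[str]:
    """Return every word form in the lexicon that shares the given lemma."""
    return sorted(form for form, target in LEXICON.items() if target == lemma)


print(lemma_of("surmise"))  # SURMISE
print(lemma_of("sonsie"))   # None, because this sketch has no entry for it
```

A dictionary lookup does no string comparison across the corpus, which is consistent with this method having the shortest execution time reported above.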