A Corpus of 21st Century Scots Texts


Levenshtein Distance

Enter a word to find nearest neighbouring words, for example ahint


Words similar to "initial" in the corpus

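Results are listed by method in this order: Levenshtein, Double Levenshtein, SoundEx, MetaPhone, Manually curated. Where a number appears in brackets, it is the distance from the search word.

Levenshtein
initial (0) - 18 freq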
initials (1) - 9 freq
initially (2) - 4 freq
intil (2) - 549 freq
intiae (2) - 1 freq
instil (2) - 1 freq
in-til (2) - 1 freq
ineetial (2) - 1 freq
intill (2) - 125 freq
brithal (3) - 2 freq
inite (3) - 6 freq
india (3) - 21 freq
bnitam (3) - 1 freq
instead (3) - 171 freq
intilt (3) - 11 freq
instillt (3) - 1 freq
instaw (3) - 1 freq
ignitin (3) - 1 freq
indian (3) - 32 freq
ignitit (3) - 1 freq
digital (3) - 25 freq
insta (3) - 2 freq
intay (3) - 3 freq
litill (3) - 1 freq
instid (3) - 2 freq
Double Levenshtein
initial (0) - 18 freq
intil (2) - 549 freq
initials (2) - 9 freq
ineetial (2) - 1 freq
ental (3) - 1 freq
natal (3) - 2 freq
ontil (3) - 53 freq
intl (3) - 1 freq
intul (3) - 1 freq
intill (3) - 125 freq
initially (3) - 4 freq
intiae (3) - 1 freq
instil (3) - 1 freq
until (3) - 477 freq
in-til (3) - 1 freq
uinitie (4) - 1 freq
oantil (4) - 1 freq
init (4) - 1 freq
unitit (4) - 29 freq
ainimal (4) - 1 freq
intrae (4) - 1 freq
entel (4) - 3 freq
leitil (4) - 1 freq
unitin (4) - 2 freq
nuptial (4) - 8 freq
SoundEx code - I534
indelible - 2 freq
intil - 549 freq
intil't - 41 freq
initials - 9 freq
intellect - 10 freq
indwaller - 1 freq
intelleck - 3 freq
indwallers - 9 freq
intult - 1 freq
indwallin - 1 freq
intill't - 1 freq
intelligence - 20 freq
intul - 1 freq
intill - 125 freq
inthelead - 1 freq
indulge - 3 freq
indulged - 3 freq
indulgit - 1 freq
intl - 1 freq
intellectual - 17 freq
intellectuals - 5 freq
intolerable - 3 freq
initial - 18 freq
intolerant - 2 freq
initiallie - 2 freq
intilt - 11 freq
initially - 4 freq
indulgin - 1 freq
intelligence' - 1 freq
in-til - 1 freq
intelligent - 12 freq
intelligint - 1 freq
intull - 5 freq
i'middle - 3 freq
intelligibeility - 5 freq
intilla - 1 freq
ineetial - 1 freq
intelligible - 5 freq
intelligibility - 2 freq
indelibly - 1 freq
intolerance - 4 freq
indwellers - 2 freq
indulgence - 4 freq
indelicate - 2 freq
intelligentsia - 5 freq
‘intae-lecher-all - 1 freq
indwell - 1 freq
intellek - 2 freq
indolence - 1 freq
intellectually - 1 freq
‘intellectual - 1 freq
indulgent - 2 freq
indylive - 3 freq
indylassie - 5 freq
ihndls - 1 freq
intulectual - 1 freq
indylanp - 1 freq
intellect'll - 1 freq
indywildycat - 4 freq
MetaPhone code - INXL
initial - 18 freq
initiallie - 2 freq
initially - 4 freq
ineetial - 1 freq
Manually curated
INITIAL
Time to execute Levenshtein function - 0.365337 milliseconds
The Levenshtein distance is the minimum number of single-character replacements, insertions or deletions needed to transform one word into another. It's useful for detecting typos and alternative spellings.
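As a rough illustration of the definition above, here is a minimal Python sketch of the standard dynamic-programming approach. It is not the code this site runs (that is not shown here); the function name `levenshtein` is simply a convenient label.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits turning a into b."""
    # Keep the shorter word on the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] = distance between the empty prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # replace ca with cb (or keep it)
            ))
        previous = current
    return previous[-1]


print(levenshtein("initial", "initials"))   # 1
print(levenshtein("initial", "initially"))  # 2
print(levenshtein("initial", "intil"))      # 2
```

These three values agree with the bracketed distances in the Levenshtein listing for "initials", "initially" and "intil".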
Time to execute Double Levenshtein function - 0.547226 milliseconds
In a stroke of genius, this runs the Levenshtein function twice, once on the full words and once with the vowels removed, and adds the two distances together, giving double weight to consonants.
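The idea can be sketched in a few lines of Python. This is an illustration under assumptions, not the site's own code: the names `strip_vowels` and `double_levenshtein` are hypothetical, and the vowel set is a guess. Treating "y" as a vowel is assumed here because it reproduces the score of 3 listed for "initially"; with "y" kept as a consonant the score would be 4.

```python
VOWELS = set("aeiouy")  # assumption: the exact vowel set used by the site is not stated


def levenshtein(a: str, b: str) -> int:
    """Plain Levenshtein distance (standard dynamic-programming version)."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


def strip_vowels(word: str) -> str:
    """Drop every character in VOWELS, leaving the consonant skeleton."""
    return "".join(ch for ch in word if ch.lower() not in VOWELS)


def double_levenshtein(a: str, b: str) -> int:
    """Distance on the full words plus distance on the vowel-less words."""
    return levenshtein(a, b) + levenshtein(strip_vowels(a), strip_vowels(b))


print(double_levenshtein("initial", "intil"))      # 2 (2 + 0: both reduce to "ntl")
print(double_levenshtein("initial", "initials"))   # 2 (1 + 1)
print(double_levenshtein("initial", "initially"))  # 3 (2 + 1)
```

Because "intil" and "initial" share the consonant skeleton "ntl", the second pass adds nothing, which is why "intil" ranks so high in the Double Levenshtein listing.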
Time to execute SoundEx function - 0.061772 milliseconds
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.
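For readers who want to see how such a code is built, below is a minimal Python sketch of the common American Soundex rules: keep the first letter, map the remaining consonants to digits, collapse adjacent duplicates, and pad to four characters. It is an assumption that the site's function follows these rules in every edge case. Ignoring apostrophes, hyphens and non-ASCII characters is also a choice made for this sketch.

```python
# Digit groups used by standard Soundex; vowels, h, w and y have no digit.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(word: str) -> str:
    """Return a four-character Soundex code, or '' if the word has no letters."""
    letters = [c for c in word.lower() if c.isascii() and c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    last = CODES.get(letters[0], "")  # the first letter's digit still blocks duplicates
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate two letters with the same digit
        code = CODES.get(ch, "")
        if code and code != last:
            result += code
            if len(result) == 4:
                break
        last = code  # a vowel resets this to "", so a repeated digit counts again
    return result.ljust(4, "0")


print(soundex("initial"))    # I534
print(soundex("intil"))      # I534
print(soundex("indelible"))  # I534
```

All three words share the code I534, matching the SoundEx code shown for the search word above.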
Time to execute MetaPhone function - 0.088145 milliseconds
Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation.[1] It fundamentally improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names which sound similar.
Time to execute Manually curated function - 0.000748 milliseconds
Manual curation uses a hand-built lookup table (lexicon) that links words to their lemmas and includes obvious typos and spelling variations. Not all words are covered.
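A lookup of this kind can be pictured as a simple dictionary. The Python sketch below is purely illustrative: the names `LEXICON` and `lemma_for` are hypothetical, and apart from "initial" mapping to INITIAL (the result shown above), the entries are invented examples rather than data from the real lexicon.

```python
from typing import Optional

# Hypothetical hand-made lexicon: spelling variant -> lemma.
LEXICON = {
    "initial": "INITIAL",        # the pairing shown in the results above
    "ineetial": "INITIAL",       # invented example of a spelling variation
    "initiallie": "INITIALLY",   # invented example
}


def lemma_for(word: str) -> Optional[str]:
    """Return the curated lemma, or None when the word is not covered."""
    return LEXICON.get(word.strip().lower())


print(lemma_for("Initial"))  # INITIAL
print(lemma_for("ahint"))    # None (not in this toy lexicon)
```

A dictionary lookup does no string comparison across the corpus, which would be consistent with this method having by far the shortest execution time listed above.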