A Corpus of 21st Century Scots Texts

Levenshtein Distance

Enter a word to find nearest neighbouring words, for example ahint

Similar words to index in Corpus

Levenshtein, Double Levenshtein, SoundEx, MetaPhone, Manually curated
Levenshtein - distance in parentheses
index (0) - 16 freq
inded (1) - 1 freq
inde (1) - 4 freq
endet (2) - 42 freq
inver (2) - 1 freq
windea (2) - 2 freq
infek (2) - 1 freq
indy' (2) - 1 freq
indaed (2) - 3 freq
indy (2) - 57 freq
inbox (2) - 4 freq
indie (2) - 1 freq
inlet (2) - 1 freq
inner (2) - 44 freq
hinder (2) - 20 freq
inoez (2) - 1 freq
indies (2) - 6 freq
ide (2) - 3 freq
mindet (2) - 2 freq
kindey (2) - 1 freq
andes (2) - 1 freq
idee (2) - 17 freq
infer (2) - 1 freq
inge (2) - 1 freq
ended (2) - 70 freq
Double Levenshtein - summed distance in parentheses
index (0) - 16 freq
inde (2) - 4 freq
inded (2) - 1 freq
india (3) - 21 freq
ande (3) - 4 freq
inda (3) - 1 freq
andes (3) - 1 freq
annex (3) - 4 freq
ended (3) - 70 freq
ineed (3) - 1 freq
under (3) - 429 freq
indeed (3) - 178 freq
nex (3) - 22 freq
indies (3) - 6 freq
ende (3) - 3 freq
fedex (3) - 1 freq
indy (3) - 57 freq
indaed (3) - 3 freq
indy' (3) - 1 freq
endet (3) - 42 freq
indie (3) - 1 freq
inbox (3) - 4 freq
need (4) - 1780 freq
andied (4) - 1 freq
endre (4) - 1 freq
SoundEx code - I532
industrial - 32 freq
indignantly - 10 freq
intoxicatin - 1 freq
indies - 6 freq
index - 16 freq
intact - 14 freq
industries - 14 freq
integrity - 13 freq
industry - 54 freq
indigestible - 1 freq
indicatin - 8 freq
indignance - 2 freq
intestate - 1 freq
indicated - 3 freq
intestines - 1 freq
intak - 2 freq
in'ts - 2 freq
intac - 2 freq
indicater - 1 freq
indicates - 13 freq
indicate - 9 freq
induced - 5 freq
inducted - 1 freq
indigestion - 1 freq
ints - 1 freq
indication - 4 freq
induction - 2 freq
'index-linked' - 2 freq
inte's - 13 freq
indignint - 2 freq
industries' - 1 freq
indigenous - 35 freq
indoctrination - 2 freq
inuits - 2 freq
induces - 1 freq
indignant - 8 freq
inmates - 1 freq
inadequate - 4 freq
indignity - 1 freq
indiscernible - 2 freq
inducements - 1 freq
indicatit - 4 freq
indisgestible - 1 freq
integral - 10 freq
indigo - 2 freq
inidcative - 1 freq
indicative - 2 freq
integratin - 2 freq
indispensable - 2 freq
intake - 4 freq
indignation - 9 freq
industrious - 2 freq
integrated - 3 freq
indications - 2 freq
indescrievable - 1 freq
indo-chinois - 1 freq
indestructible - 2 freq
inhauds - 3 freq
'industrial - 1 freq
in-twist- - 1 freq
indiginous - 2 freq
intaks - 1 freq
integration - 5 freq
intakkin - 1 freq
indicator - 1 freq
indistinctly - 1 freq
industrialisation - 1 freq
integrate - 5 freq
integratit - 3 freq
indecipherable - 1 freq
industrialised - 2 freq
intoxicating' - 1 freq
indeigenous - 1 freq
indict - 1 freq
intestine - 1 freq
inducin - 2 freq
“integrity - 1 freq
intack - 1 freq
indecent - 1 freq
indisputable - 1 freq
intoxicated - 1 freq
indiscretions - 1 freq
indyscotwales - 7 freq
inthisweather - 1 freq
‘industrial’ - 1 freq
indescretions - 1 freq
indy’s - 1 freq
indykaila - 1 freq
iantakto - 1 freq
industrialvid - 1 freq
indyscotparty - 1 freq
indyscotland - 1 freq
inthechoir - 1 freq
iaindoesjokes - 2 freq
iainwhytesnp - 1 freq
'indigenous' - 1 freq
indycamp - 3 freq
indstatehapp - 2 freq
indigodreamspub - 1 freq
indoctrinated - 2 freq
indysoosie - 1 freq
imwatson - 1 freq
iantsaoir - 1 freq
indigofast - 1 freq
intoxicating - 1 freq
indoctrinators - 1 freq
iaindgordon - 1 freq
indigenousyouth - 1 freq
indigenouspeoples - 1 freq
indigenouscommunities - 1 freq
indyscotnews - 1 freq
ineedquiet - 1 freq
MetaPhone code - INTKS
index - 16 freq
intaks - 1 freq
Manually curated - INDEX
Time to execute Levenshtein function - 0.194856 milliseconds
The Levenshtein distance is the number of single-character replacements, insertions or deletions needed to transform one word into another. It's useful for detecting typos and alternative spellings.
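As an illustration of the definition above, here is a minimal Python sketch of the standard dynamic-programming approach. It is not the code behind this page; the function name `levenshtein` is chosen for the example only.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the single-character edits needed to turn a into b."""
    # previous[j] is the distance between the prefix of a processed so far
    # (initially empty) and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # turning the first i characters of a into "" takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # replace ca with cb, or keep it
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    # A few words taken from the Levenshtein list on this page.
    for word in ["index", "inded", "inde", "indy", "ended"]:
        print(word, levenshtein("index", word))
```

For the words sampled here the sketch gives the same distances as the list above: 0 for index, 1 for inded and inde, 2 for indy and ended.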
Time to execute Double Levenshtein function - 0.329058 milliseconds
In a stroke of genius, this runs the Levenshtein function twice, once on the full words and once with the vowels removed, and adds the two distances together, giving double weight to consonants.
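The idea can be sketched in Python as follows. This is a guess at the approach, not the site's own code: the vowel set `aeiou` and the names `strip_vowels` and `double_levenshtein` are assumptions made for the example.

```python
VOWELS = set("aeiou")  # assumption: the page does not say which letters count as vowels


def levenshtein(a: str, b: str) -> int:
    """Plain Levenshtein distance, computed row by row."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


def strip_vowels(word: str) -> str:
    """Drop every vowel, leaving the consonant skeleton of the word."""
    return "".join(c for c in word if c.lower() not in VOWELS)


def double_levenshtein(a: str, b: str) -> int:
    """Distance on the full words plus distance on their consonant skeletons."""
    return levenshtein(a, b) + levenshtein(strip_vowels(a), strip_vowels(b))


if __name__ == "__main__":
    # A few words taken from the Double Levenshtein list on this page.
    for word in ["index", "inde", "india", "under", "need"]:
        print(word, double_levenshtein("index", word))
```

Under these assumptions the sketch gives the same scores as the list for the sampled words: inde scores 2, india and under score 3, and need scores 4.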
Time to execute SoundEx function - 0.027748 milliseconds
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.
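To show how such a code is built, here is a minimal Python sketch of the common American Soundex rules: keep the first letter, map the remaining consonants to digits, collapse adjacent repeats, and pad or cut to four characters. The function name `soundex` and the handling of non-letters are choices made for this example, and the site's own implementation may differ in edge cases.

```python
# Digit assigned to each consonant group; vowels, h, w and y have no digit.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(word: str) -> str:
    """Return a four-character Soundex code, or "" if the word has no a-z letters."""
    letters = [c for c in word.lower() if "a" <= c <= "z"]
    if not letters:
        return ""
    first = letters[0]
    digits = []
    last = CODES.get(first, "")  # the first letter's own digit suppresses a repeat
    for c in letters[1:]:
        if c in "hw":
            continue  # h and w do not separate letters that share a digit
        code = CODES.get(c, "")
        if code and code != last:
            digits.append(code)
        last = code  # a vowel resets this, so a repeat across a vowel is kept
    return (first.upper() + "".join(digits) + "000")[:4]


if __name__ == "__main__":
    # Words taken from the SoundEx list on this page.
    for word in ["index", "industrial", "intaks", "inmates"]:
        print(word, soundex(word))
```

Each of the four sample words yields I532, the code shown at the head of the SoundEx list.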
Time to execute MetaPhone function - 0.036774 milliseconds
Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation. It fundamentally improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names that sound similar.
Time to execute Manually curated function - 0.000823 milliseconds
Manual curation uses a hand-built lookup table (lexicon) that links words to their lemmas and includes obvious typos and spelling variations. Not all words are covered.
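A lookup of this kind can be sketched in Python as a plain dictionary. Only the pairing of index with INDEX comes from this page; the other entries, the name `LEXICON` and the function `lookup_lemma` are hypothetical and stand in for the real hand-built table.

```python
from typing import Optional

# Hypothetical lexicon: maps a surface form to its lemma.
LEXICON = {
    "index": "INDEX",    # pairing shown on this page
    "indexes": "INDEX",  # invented entry, for illustration only
    "indx": "INDEX",     # invented typo, for illustration only
}


def lookup_lemma(word: str) -> Optional[str]:
    """Return the curated lemma for a word, or None if the word is not covered."""
    return LEXICON.get(word.lower())


if __name__ == "__main__":
    print(lookup_lemma("index"))
    print(lookup_lemma("ahint"))  # not in this toy lexicon, so None
```

A dictionary lookup does no string comparison against the rest of the corpus, which fits with this method being the fastest of the five timings reported above.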