A Corpus of 21st Century Scots Texts

Levenshtein Distance

Enter a word to find nearest neighbouring words, for example sonsie

Similar words to katja in Corpus

Results by method: Levenshtein, Double Levenshtein, SoundEx, MetaPhone, Manually curated
Levenshtein (distance in parentheses)
katja (0) - 6 freq
kita (2) - 1 freq
iatj (2) - 1 freq
katt (2) - 1 freq
katy (2) - 18 freq
kathak (2) - 1 freq
katie (2) - 34 freq
cata (2) - 1 freq
raja (2) - 1 freq
patna (2) - 2 freq
karla (2) - 1 freq
ata (2) - 58 freq
matta (2) - 1 freq
watna (2) - 1 freq
ataa (2) - 7 freq
kanna (2) - 1 freq
baja (2) - 1 freq
nadja (2) - 1 freq
wadja (2) - 1 freq
kath (2) - 3 freq
karma (2) - 5 freq
kathy (2) - 4 freq
kappa (2) - 1 freq
kat' (2) - 1 freq
data (2) - 68 freq
Double Levenshtein (combined distance in parentheses)
katja (0) - 6 freq
kita (3) - 1 freq
kat' (3) - 1 freq
kathy (3) - 4 freq
kath (3) - 3 freq
katm (3) - 1 freq
kat (3) - 156 freq
katt (3) - 1 freq
katy (3) - 18 freq
kate (3) - 176 freq
iatj (3) - 1 freq
katie (3) - 34 freq
taj (4) - 2 freq
kott (4) - 1 freq
utj (4) - 1 freq
ket (4) - 1 freq
ktze (4) - 1 freq
kite (4) - 19 freq
tjo (4) - 2 freq
katiec (4) - 18 freq
kj (4) - 4 freq
kits (4) - 3 freq
kythe (4) - 65 freq
ktn (4) - 1 freq
kt (4) - 3 freq
SoundEx code - K320
kites - 6 freq
ketch - 9 freq
kittiwake - 2 freq
kate's - 26 freq
kythes - 64 freq
kiddies - 2 freq
kids - 87 freq
kitchie - 71 freq
katja - 6 freq
kitty's - 1 freq
ket's - 1 freq
kidz - 1 freq
kits - 3 freq
kïsts - 1 freq
kytes - 2 freq
kat's - 15 freq
'kat's - 1 freq
kittag - 1 freq
kuts - 1 freq
katsh - 1 freq
'katze - 1 freq
kyths - 5 freq
kitsch - 4 freq
kidds - 2 freq
kudos - 2 freq
kathak - 1 freq
kodak - 1 freq
kszydj - 1 freq
katiec - 18 freq
kdaz - 1 freq
keithc - 1 freq
kdz - 1 freq
kdeyg - 1 freq
kid's - 1 freq
ketts - 1 freq
kxdkh - 1 freq
ktsc - 1 freq
ktze - 1 freq
kdjya - 1 freq
kdiouwg - 1 freq
kawwhtaxw - 1 freq
MetaPhone code - KTJ
cottage - 49 freq
katja - 6 freq
gadjee - 1 freq
cottage' - 1 freq
coattage - 1 freq
cottige - 1 freq
gdewjo - 1 freq
Manually curated
KATJA
Time to execute Levenshtein function - 0.289055 milliseconds
The Levenshtein distance is the minimum number of single-character replacements, insertions or deletions needed to transform one word into another. It is useful for detecting typos and alternative spellings.
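As a rough illustration of the definition above, the distance can be computed with the standard row-by-row dynamic programming approach. This is a minimal sketch, not the code behind this page; the function name `levenshtein` is chosen here for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions or substitutions."""
    # previous[j] is the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute ca with cb, or keep a match
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    for word in ("katja", "katy", "kita", "data"):
        print(word, levenshtein("katja", word))
```

For instance, katja becomes katy by replacing j with y and deleting the final a, which is two edits and matches the (2) shown in the results list.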
Time to execute Double Levenshtein function - 0.523944 milliseconds
In a stroke of genius, this runs the Levenshtein function twice, once on the full words and once with the vowels removed, and adds the two distances together, giving double weight to consonants.
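The double pass can be sketched as follows. The vowel set is an assumption: the page does not state it, but the distances listed for kathy (3) and kythe (4) only work out if y is stripped along with a, e, i, o and u. The names `strip_vowels` and `double_levenshtein` are illustrative, not taken from the site.

```python
VOWELS = set("aeiouy")  # assumption: y is treated as a vowel (inferred from the listed results)


def levenshtein(a: str, b: str) -> int:
    """Plain edit distance, computed one row at a time."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost))
        previous = current
    return previous[-1]


def strip_vowels(word: str) -> str:
    """Drop every character in the assumed vowel set."""
    return "".join(ch for ch in word if ch.lower() not in VOWELS)


def double_levenshtein(a: str, b: str) -> int:
    """Distance on the full words plus distance on the consonant skeletons."""
    return levenshtein(a, b) + levenshtein(strip_vowels(a), strip_vowels(b))


if __name__ == "__main__":
    for word in ("katja", "kat", "kathy", "kythe"):
        print(word, double_levenshtein("katja", word))
```

Because a consonant change shows up in both passes while a vowel change shows up only in the first, katja and kate (same skeleton apart from the j) score closer than two words that differ in their consonants.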
Time to execute SoundEx function - 0.062443 milliseconds
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.
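A minimal sketch of the usual American Soundex rules is shown here to make the encoding concrete. It is an assumption that the page uses this exact variant; the sketch ignores any character outside A to Z, keeps the first letter, and pads the code to four characters.

```python
# Consonant groups and their Soundex digits; vowels, H, W and Y carry no digit.
_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(word: str) -> str:
    """Return the four-character Soundex code, or "" if the word has no A-Z letters."""
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = letters[0]
    last = _CODES.get(letters[0], "")  # the first letter's digit still blocks a repeat
    for c in letters[1:]:
        if c in "HW":
            continue  # H and W do not separate two letters with the same digit
        code = _CODES.get(c, "")
        if code and code != last:
            result += code
            if len(result) == 4:
                break
        last = code  # a vowel resets this to "", so a repeated digit is coded again
    return result.ljust(4, "0")


if __name__ == "__main__":
    for word in ("katja", "kids", "kittiwake", "kszydj"):
        print(word, soundex(word))
```

Under these rules katja keeps K, codes t as 3 and j as 2, and is padded with a zero, giving the K320 shown in the results.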
Time to execute MetaPhone function - 0.038738 milliseconds
Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation. It fundamentally improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names that sound similar.
Time to execute Manually curated function - 0.000852 milliseconds
Manual curation uses a hand-built lookup table (lexicon) that links words to their lemmas and includes obvious typos and spelling variations. Not all words are covered.
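A lookup of this kind can be sketched as a plain dictionary from surface forms to lemmas. The entries below are hypothetical placeholders for illustration only and are not taken from the site's lexicon; the uppercase lemma convention and the name `lookup_lemma` are likewise assumptions.

```python
from typing import Optional

# Hypothetical entries: surface form -> lemma. The real lexicon is hand-built.
LEXICON = {
    "sonsie": "SONSIE",
    "sonsy": "SONSIE",  # illustrative spelling variant pointing at the same lemma
}


def lookup_lemma(word: str) -> Optional[str]:
    """Return the curated lemma for a word, or None when the word is not covered."""
    return LEXICON.get(word.lower())


if __name__ == "__main__":
    print(lookup_lemma("Sonsy"))    # found via the variant entry
    print(lookup_lemma("unknown"))  # not in the table, so None
```

A dictionary lookup involves no per-word distance computation, which fits with this method having the smallest execution time reported above.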