A Corpus of 21st Century Scots Texts

Levenshtein Distance

Enter a word to find nearest neighbouring words, for example ablow

Similar words to zero in Corpus

Methods: Levenshtein, Double Levenshtein, SoundEx, MetaPhone, Manually curated
Levenshtein (distance in parentheses)
zero (0) - 9 freq
nero (1) - 1 freq
zeno (1) - 1 freq
aero (1) - 1 freq
pero (1) - 1 freq
hero (1) - 81 freq
herm (2) - 49 freq
eo (2) - 6 freq
meo (2) - 1 freq
hera (2) - 12 freq
fern (2) - 4 freq
cerd (2) - 6 freq
redo (2) - 1 freq
vers (2) - 1 freq
dere (2) - 207 freq
perv (2) - 3 freq
hert (2) - 762 freq
gert (2) - 4 freq
emo (2) - 3 freq
ber (2) - 4 freq
ervo (2) - 2 freq
gern (2) - 1 freq
demo (2) - 3 freq
zoo (2) - 40 freq
zeus (2) - 12 freq
Double Levenshtein (combined distance in parentheses)
zero (0) - 9 freq
zr (2) - 2 freq
zry (2) - 1 freq
hero (2) - 81 freq
zara (2) - 3 freq
zeno (2) - 1 freq
pero (2) - 1 freq
aero (2) - 1 freq
nero (2) - 1 freq
zeta (3) - 1 freq
heero (3) - 1 freq
gro (3) - 3 freq
ere (3) - 287 freq
peri (3) - 2 freq
era (3) - 30 freq
per (3) - 63 freq
ery (3) - 15 freq
ger (3) - 2 freq
very (3) - 683 freq
pere (3) - 1 freq
mere (3) - 24 freq
rer (3) - 3 freq
siro (3) - 1 freq
eri (3) - 1 freq
ker (3) - 24 freq
SoundEx code - Z600
zero - 9 freq
zerah - 1 freq
zarah - 1 freq
zara - 3 freq
“zero - 1 freq
zr - 2 freq
zrr - 1 freq
zrae - 1 freq
zry - 1 freq
MetaPhone code - SR
sair - 772 freq
sure - 972 freq
sorry - 483 freq
saur - 12 freq
sour - 7 freq
sir- - 4 freq
sir - 356 freq
soor - 93 freq
'sir' - 1 freq
'sorry - 14 freq
saire - 4 freq
sare - 30 freq
sairy - 5 freq
sore - 53 freq
seer - 39 freq
'sair - 2 freq
'sir - 7 freq
sairie - 22 freq
sur - 22 freq
sorr - 1 freq
wycer - 6 freq
suir - 11 freq
soarry - 4 freq
ceri - 1 freq
sorrow - 29 freq
sairrie - 3 freq
sorra - 35 freq
syria - 11 freq
soiree - 5 freq
x-ray - 7 freq
sarah - 40 freq
cerry - 7 freq
soir - 1 freq
sire - 8 freq
siura - 1 freq
ceoor - 1 freq
soary - 3 freq
soarey - 1 freq
soaree - 1 freq
sour' - 1 freq
soar - 10 freq
zero - 9 freq
'sure - 6 freq
sure' - 2 freq
surrey - 2 freq
sear - 1 freq
sorr-ee - 1 freq
zerah - 1 freq
sorrie - 2 freq
sooer - 1 freq
zarah - 1 freq
sarry - 2 freq
sere - 3 freq
soarro - 3 freq
soaroo - 2 freq
sara - 15 freq
sierra - 1 freq
sirrah - 1 freq
sar - 1 freq
serr - 9 freq
sorrae - 2 freq
zara - 3 freq
cer - 1 freq
suree - 1 freq
ser - 19 freq
suire - 2 freq
‘sorry - 11 freq
‘sure - 3 freq
sor- - 1 freq
sri - 11 freq
xr - 3 freq
cer- - 1 freq
“sorry - 10 freq
“sure - 1 freq
seer' - 1 freq
soeur - 1 freq
sru - 1 freq
“zero - 1 freq
'sare - 1 freq
sari - 3 freq
sre - 2 freq
“sairie - 1 freq
€”sorry - 1 freq
‘sair - 2 freq
sr - 3 freq
soirée - 1 freq
zr - 2 freq
zrr - 1 freq
zrae - 1 freq
sair’ - 1 freq
“sorry - 1 freq
sir” - 1 freq
sarahw - 1 freq
seery - 2 freq
soory - 2 freq
'sarah - 1 freq
hzr - 1 freq
ssre - 1 freq
zry - 1 freq
‘sorry - 1 freq
siro - 1 freq
xre - 1 freq
señor - 1 freq
Manually curated - ZERO
Time to execute Levenshtein function - 0.193676 milliseconds
The Levenshtein distance is the number of characters you have to replace, insert or delete to transform one word into another. It is useful for detecting typos and alternative spellings.
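As an illustration of the definition above, here is a minimal Python sketch of the standard dynamic-programming calculation. It is not the code this page runs; the function name `levenshtein` is chosen here only for the example.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the single-character edits needed to turn a into b."""
    # previous[j] holds the distance between the processed prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # turning i characters into an empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # replace ca with cb, or keep a match
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    for word in ("zero", "hero", "zoo", "zeus"):
        print(word, levenshtein("zero", word))
```

With the search word zero, this gives 1 for hero and 2 for zoo and zeus, the same distances as in the Levenshtein list.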
Time to execute Double Levenshtein function - 0.343039 milliseconds
In a stroke of genius, this runs the Levenshtein function twice, once on the full words and once with the vowels removed, and adds the two distances together, giving double weight to consonants.
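A rough Python sketch of this idea follows. It is a guess at the approach, not the site's own code. The helper names are invented for the example, and treating y as a vowel is an assumption, made because it reproduces the distances listed for zry (2) and very (3).

```python
def levenshtein(a: str, b: str) -> int:
    """Plain Levenshtein distance (insert, delete, replace each cost 1)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


VOWELS = set("aeiouy")  # assumption: y is counted as a vowel


def strip_vowels(word: str) -> str:
    """Keep only the characters that are not vowels."""
    return "".join(ch for ch in word if ch.lower() not in VOWELS)


def double_levenshtein(a: str, b: str) -> int:
    """Distance on the full words plus distance on the vowel-less words."""
    return levenshtein(a, b) + levenshtein(strip_vowels(a), strip_vowels(b))


if __name__ == "__main__":
    for word in ("hero", "zr", "zry", "zeta", "very"):
        print(word, double_levenshtein("zero", word))
```

Because the consonant skeleton is compared a second time, hero moves from distance 1 to 2 (z and h differ in both passes), while zr stays at 2 because its skeleton is identical to that of zero.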
Time to execute SoundEx function - 0.034609 milliseconds
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.
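To show how such a code is built, here is a small Python sketch of the common American Soundex rules: keep the first letter, turn the following consonants into digits, skip vowels, collapse repeated digits, and pad to four characters. It is an illustration only and may differ in edge cases from the function this page calls. Ignoring non-letter characters such as quotation marks is an assumption made here.

```python
# Digit assigned to each consonant group; vowels, h, w and y get no digit.
CODES = {}
for digit, letters in (("1", "bfpv"), ("2", "cgjkqsxz"), ("3", "dt"),
                       ("4", "l"), ("5", "mn"), ("6", "r")):
    for letter in letters:
        CODES[letter] = digit


def soundex(word: str) -> str:
    """Return a four-character Soundex code, or "" if word has no a-z letters."""
    letters = [ch for ch in word.lower() if "a" <= ch <= "z"]
    if not letters:
        return ""
    result = letters[0].upper()
    previous = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate two letters with the same digit
        code = CODES.get(ch, "")
        if code and code != previous:
            result += code
        previous = code  # a vowel resets this, so a repeated digit can follow
    return (result + "000")[:4]


if __name__ == "__main__":
    for word in ("zero", "zerah", "zara", "zrr"):
        print(word, soundex(word))
```

Every word in the SoundEx list reduces to a z followed by a single r sound, which is why they all share the code Z600.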
Time to execute MetaPhone function - 0.038924 milliseconds
Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation. It fundamentally improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names that sound similar.
Time to execute Manually curated function - 0.000949 milliseconds
Manual curation uses a hand-built lookup table (lexicon) that links words to their lemmas and includes obvious typos and spelling variations. Not all words are covered.
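A lookup of this kind can be sketched in Python as a plain dictionary. The table contents and the names `LEXICON` and `lemma_for` are hypothetical, made up for the example. Only the zero to ZERO mapping comes from the results on this page.

```python
from typing import Optional

# Hypothetical hand-built lexicon: surface form -> lemma.
LEXICON = {
    "zero": "ZERO",   # the mapping shown in the results for this search
    "zeero": "ZERO",  # invented spelling variant, for illustration only
}


def lemma_for(word: str) -> Optional[str]:
    """Return the curated lemma for word, or None if the word is not covered."""
    return LEXICON.get(word.lower())


if __name__ == "__main__":
    print(lemma_for("zero"))   # a covered word
    print(lemma_for("ablow"))  # not in this toy table
```

A dictionary lookup does no string comparison against the rest of the vocabulary, which fits with this method being by far the fastest of the five timings reported.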