A Corpus of 21st Century Scots Texts


Levenshtein Distance

Enter a word to find nearest neighbouring words, for example ablow


Words similar to "delete" in the corpus

Levenshtein
delete (0) - 10 freq
delyte (1) - 4 freq
deleted (1) - 5 freq
delite (1) - 4 freq
deluge (2) - 2 freq
deet (2) - 24 freq
delft (2) - 2 freq
melee (2) - 3 freq
celeste (2) - 3 freq
dellt (2) - 3 freq
delytes (2) - 1 freq
elite (2) - 21 freq
delude (2) - 1 freq
delta (2) - 1 freq
denee (2) - 1 freq
deleer (2) - 1 freq
beleve (2) - 6 freq
debate (2) - 81 freq
deletit (2) - 3 freq
deece (2) - 1 freq
veleta (2) - 1 freq
depute (2) - 21 freq
delegate (2) - 3 freq
delt (2) - 3 freq
repete (2) - 1 freq
Double Levenshtein
delete (0) - 10 freq
delite (1) - 4 freq
delyte (1) - 4 freq
delta (2) - 1 freq
delt (2) - 3 freq
delyt (2) - 3 freq
deleted (2) - 5 freq
veleta (3) - 1 freq
daelt (3) - 2 freq
depute (3) - 21 freq
dlt (3) - 1 freq
dealt (3) - 33 freq
relate (3) - 15 freq
delve (3) - 6 freq
debate (3) - 81 freq
delyts (3) - 1 freq
delegate (3) - 3 freq
deletit (3) - 3 freq
deet (3) - 24 freq
delft (3) - 2 freq
dellt (3) - 3 freq
dalt (3) - 1 freq
delytes (3) - 1 freq
elite (3) - 21 freq
deleer (3) - 1 freq
SoundEx code - D430
daily-day - 8 freq
dwalt - 4 freq
dealt - 33 freq
daled - 5 freq
delete - 10 freq
delt - 3 freq
dollt - 1 freq
dwallt - 4 freq
dwallit - 2 freq
dulled - 1 freq
dolled - 6 freq
dialled - 2 freq
doled - 3 freq
deludey - 1 freq
dellt - 3 freq
delite - 4 freq
dilled - 3 freq
delyte - 4 freq
day-auld - 1 freq
delyt - 3 freq
delude - 1 freq
dillt - 1 freq
doilt - 1 freq
delled - 2 freq
dalt - 1 freq
daelt - 2 freq
dolt - 1 freq
delta - 1 freq
dailiday - 1 freq
dailt - 1 freq
dallied - 1 freq
duality - 1 freq
delayed - 4 freq
dewalt - 1 freq
delayed… - 2 freq
dlt - 1 freq
'dildo - 1 freq
MetaPhone code - TLT
tellt - 505 freq
telt - 1538 freq
till't - 13 freq
tell't - 3 freq
til't - 40 freq
daily-day - 8 freq
told - 133 freq
toilet - 91 freq
til'it - 1 freq
tauld - 11 freq
dealt - 33 freq
tilt - 30 freq
'telt - 1 freq
daled - 5 freq
delete - 10 freq
delt - 3 freq
tailed - 7 freq
dollt - 1 freq
tallied - 1 freq
tilled - 3 freq
telit - 1 freq
dulled - 1 freq
dolled - 6 freq
tyle't - 1 freq
tull't - 4 freq
dialled - 2 freq
doled - 3 freq
tiled - 2 freq
tolt - 138 freq
deludey - 1 freq
tould - 2 freq
tolled - 1 freq
dellt - 3 freq
delite - 4 freq
dilled - 3 freq
telled - 2 freq
toiled - 2 freq
delyte - 4 freq
day-auld - 1 freq
töllied - 1 freq
delyt - 3 freq
delude - 1 freq
tolta - 1 freq
toalt - 1 freq
towld - 5 freq
told' - 2 freq
dillt - 1 freq
doilt - 1 freq
toledo - 1 freq
delled - 2 freq
dalt - 1 freq
daelt - 2 freq
teelt - 1 freq
dolt - 1 freq
‘telt - 1 freq
delta - 1 freq
tooled - 3 freq
“telt - 1 freq
“toilet - 1 freq
dailiday - 1 freq
dailt - 1 freq
dallied - 1 freq
tel't - 2 freq
duality - 1 freq
toult - 1 freq
teld - 2 freq
tlod - 1 freq
dlt - 1 freq
telttt - 1 freq
'dildo - 1 freq
tltu - 4 freq
DELETE
Time to execute Levenshtein function - 0.180254 milliseconds
The Levenshtein distance is the number of characters you have to replace, insert or delete to transform one word into another. It is useful for detecting typos and alternative spellings.
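As a rough illustration of the definition above, the distance can be computed with the usual dynamic-programming table. This is a minimal sketch, not the code this site runs; the function name `levenshtein` is chosen here for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the single-character edits (replace, insert, delete) turning a into b."""
    # Keep the shorter word in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] holds the distance between the first i-1 letters of a
    # and the first j letters of b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete a letter from a
                current[j - 1] + 1,      # insert a letter into a
                previous[j - 1] + cost,  # replace (or keep) a letter
            ))
        previous = current
    return previous[-1]


for word in ("delete", "delyte", "deleted", "deluge", "debate"):
    print(word, levenshtein("delete", word))
```

The loop prints each word with its distance from "delete", which can be compared against the bracketed numbers in the Levenshtein list.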
Time to execute Double Levenshtein function - 0.339482 milliseconds
In a stroke of genius, this runs the Levenshtein function twice, once on the full words and once with the vowels removed, and adds the two distances together, giving double weight to consonants.
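The doubled score described above can be sketched as follows. The helper names (`strip_vowels`, `double_levenshtein`) are hypothetical, and treating "y" as a vowel is an assumption inferred from the listed scores (for example "delyte" scoring 1); the site's own implementation may differ.

```python
VOWELS = set("aeiouy")  # assumption: "y" is stripped along with a, e, i, o, u


def levenshtein(a: str, b: str) -> int:
    """Plain edit distance, computed row by row."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


def strip_vowels(word: str) -> str:
    """Return the word with every vowel removed, leaving the consonant skeleton."""
    return "".join(ch for ch in word if ch.lower() not in VOWELS)


def double_levenshtein(a: str, b: str) -> int:
    """Distance on the full words plus distance on their consonant skeletons."""
    return levenshtein(a, b) + levenshtein(strip_vowels(a), strip_vowels(b))


for word in ("delta", "deleted", "veleta", "dlt"):
    print(word, double_levenshtein("delete", word))
```

A word such as "delta" keeps the consonant skeleton "dlt", so only its vowel changes count, while "deleted" is penalised twice for the extra "d".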
Time to execute SoundEx function - 0.028006 milliseconds
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.
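For readers who want to see how a code such as D430 arises, here is a minimal sketch of standard American Soundex. It is an assumption that the site follows these exact rules, and the function name `soundex` is chosen here for illustration.

```python
def soundex(word: str) -> str:
    """Encode a word as its first letter plus three digits (standard American Soundex)."""
    codes = {}
    for group, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                         ("l", "4"), ("mn", "5"), ("r", "6")):
        for ch in group:
            codes[ch] = digit

    # Ignore hyphens, apostrophes and anything else outside a-z.
    letters = [ch for ch in word.lower() if "a" <= ch <= "z"]
    if not letters:
        return ""

    result = letters[0].upper()
    previous = codes.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters that share a code
        code = codes.get(ch, "")
        if code and code != previous:
            result += code
        previous = code  # a vowel resets this, so a repeated code counts again
    return (result + "000")[:4]


for word in ("delete", "dealt", "day-auld", "duality"):
    print(word, soundex(word))
```

All four sample words reduce to the letter D followed by the codes for "l" and "t/d", which is why they share one list.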
Time to execute MetaPhone function - 0.038502 milliseconds
Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation.[1] It fundamentally improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names which sound similar.
Time to execute Manually curated function - 0.000806 milliseconds
Manual curation uses a hand-built lookup table (lexicon) that links words to their lemmas and includes obvious typos and spelling variations. Not all words are covered.
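A lookup of this kind can be sketched with a plain dictionary. The entries in `LEXICON` below are hypothetical examples made up for illustration, not rows from the site's actual lexicon, and the function names are likewise assumptions.

```python
from typing import List, Optional

# Hypothetical hand-made entries: spelling variant -> lemma.
LEXICON = {
    "telt": "tell",
    "tellt": "tell",
    "tauld": "tell",
}


def lemma_for(word: str) -> Optional[str]:
    """Return the curated lemma, or None when the word is not covered."""
    return LEXICON.get(word.lower())


def variants_of(word: str) -> List[str]:
    """List the other curated spellings that share this word's lemma."""
    lemma = lemma_for(word)
    if lemma is None:
        return []
    return sorted(w for w, l in LEXICON.items()
                  if l == lemma and w != word.lower())


print(lemma_for("telt"))
print(variants_of("telt"))
print(lemma_for("delete"))
```

Because the table is a direct key lookup, it is fast, but any word missing from the table simply returns nothing.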